

Psychometrika


A JOURNAL DEVOTED TO THE DEVEL- 
OPMENT OF PSYCHOLOGY AS A 
QUANTITATIVE RATIONAL SCIENCE 




















































THE PSYCHOMETRIC SOCIETY - ORGANIZED IN 1935 








VOLUME 1
NUMBER 1


MARCH
1936

















GENERAL INFORMATION CONCERNING PSYCHOMETRIKA 


Psychometrika is the official journal of the Psychometric Society. This Society 
was organized by a group of psychologists in February 1935, and was affiliated 
with the American Psychological Association in September 1935. Psychometrika 
will contain articles on the following subjects: 


(1) the development of quantitative rationale for the solution of psycho- 
logical problems, 
(2) new mathematical and statistical techniques for the evaluation of 
psychological data, 
(3) aids in the application of statistical techniques, such as monographs,
tables, work-sheet layouts, forms, and apparatus, 
(4) critiques or reviews of significant studies involving the use of quanti- 
tative techniques, 
(5) general theoretical articles on quantitative methodology in the social 
and biological sciences. 
The emphasis is to be placed on articles of type (1), in so far as articles of that 
type are available.


MATHEMATICAL BIOPHYSICS AND PSYCHOLOGY
N. RASHEVSKY 


I 


In building a system of mathematical biophysics, as the foundation
of which we took the phenomena of cellular multiplication, we have
had naturally several occasions to discuss also various
physico-mathematical aspects of psychological phenomena.1 This is not
surprising, since the latter form a prominent branch of general
biology. There is, however, still a big difference in our treatment
of biophysics of general biology and in the treatment of biophysics
of psychology. In the former we started with a rather general concept
of a metabolizing system and developed therefrom in an almost purely
deductive, synthetic way an elaborate abstract theoretical system of
biophysics, which incidentally happened to throw interesting light on
actual real phenomena of cellular biology. We intentionally underline
here the word “incidentally”, to emphasize that, although we consider
the development of mathematical biophysics as eventually of greatest
importance for the interpretation of empirical biology, we do not
consider this “utilitarian” aim as the principal driving motive for
our study. Like any other theoretical science, mathematical biophysics
has a right to existence of its own, and its interest lies not merely
in the number of empirical facts which it can explain, but in its
internal logical consistency and mathematical beauty. As a consolation
for the “fact-seekers” we have many times pointed out that usually
such pure theoretical studies bear most unexpected practical fruits.
But to us this is really beside the point.

In our previous studies of the biophysics of psychological phe- 
nomena we have not come as close to a purely synthetic development 
of the subject as in the case of cell biophysics. Our investigations 
had rather a character of attempts to orient ourselves in the various 
possibilities of a physico-mathematical approach to this field.2 We 
believe that the time has come for an attempt at a purely theoretical 
biophysical psychology. 

1 Rashevsky, N., Foundations of Mathematical Biophysics, Phil. of Sc. 1, 176,
1934; Rashevsky, N., Mathematical Biophysics, Nature, 135, 528, 1935.
2 Jl. Gen. Psych. 5, 207 and 368, 1931; 13, 308, 1935, Phil. of Sc. 1, 409, 1934.

The ideal situation would be of course to develop this branch of
mathematical biophysics as a logical continuation of the biophysics of
the cell, inasmuch as the cell remains still the fundamental unit even
in the central nervous system. Such is the ultimate remote goal of 
mathematical biophysics as we see it. At present however we must 
be somewhat more modest, and develop biophysical psychology on 
the basis of postulates which are not yet directly reducible to the 
fundamental postulates of cell biophysics. This of course represents 
a certain limitation, but this limitation is not of a very essential char- 
acter. After all, no matter what general postulate we put at the be- 
ginning, we have to set a limit to ourselves, lest we never come to the 
bottom. In starting the mathematical biophysics of the cell by a con- 
sideration of the rather general concept of a metabolizing system, we 
do not ask why metabolizing systems exist at all, and whether we 
should not “deduce” the very existence of such metabolizing systems 
from some still more general postulates. Yet such a question is logi- 
cally quite legitimate, and attempts of that nature will probably be 
made in the future. Our initial fundamental concept will cease to be 
initial and fundamental, but the mathematical deductions from it, 
which constitute the body of mathematical biophysics, will remain in- 
tact. 

With those remarks in mind, we shall now proceed to the choice 
of the fundamental concept which we shall use as a foundation for 
building the mansion of mathematical biophysics of psychological phe- 
nomena. Although the system to be built will be a purely theoretical 
one, yet again our choice of the fundamental concept shall not be 
quite arbitrary, but will be guided by suggestions given us by Nature. 
We shall again arrive at our fundamental concept by abstracting as 
much as possible from the complexity of the real objects. 

The differences between the central nervous systems of differ- 
ent organisms are about as great as the differences between various 
kinds of cells. Looking for criteria common to absolutely all types of 
central nerve systems, we first find that they all consist of geometri- 
cally individual conducting elements, in intimate contact with each 
other. The physiological nature of this contact is for our general pur- 
pose irrelevant, and we do not therefore enter into the controversial 
question of the continuity of all nerve paths, versus an actual synap- 
tic discontinuity. The geometrical individuality of the conducting ele- 
ments is not affected by those considerations. 

Stimulation of some of those elements produces an excitation in 
them, which is conducted away and may be transmitted to several 
other elements along which it is brought to efferent organs, in which 
it produces a reaction. Stimulation of some other elements is not
necessarily transmitted to any end-organ. Finally, stimulation of still
different nerve elements results in an inhibition of other neural
elements. We thus are led to consider at the start two different kinds of
nerve elements, excitatory and inhibitory. The bulk of experimental 
evidence makes it rather probable that the excitatory and inhibitory 
action of one neural element on the other is due to two different kinds 
of substances, an excitatory and an inhibitory one. We shall there- 
fore begin by investigating the theoretical consequences of the as- 
sumption of two such substances. As regards the propagation of the 
excitation along an individual conducting element, we shall take the 
accepted view, that it is due to a reexcitation of adjacent regions by 
local bioelectric currents. Both experimental evidence and recent the- 
oretical studies favor this view. 

We also assume the all-or-nothing law. Any nerve possesses a 
finite threshold, which the intensity of the stimulus must exceed in 
order to produce excitation. But, once produced, the excitatory phe- 
nomenon does not depend on the intensity of the stimulus, but only on 
the physico-chemical nature of the nerve. 

Since we are building a purely theoretical science, we do not con- 
sider any of the above assumptions as having necessarily a counter- 
part in reality. We are investigating all possible cases, and merely 
for the sake of definiteness we begin with this one. In future publi- 
cations we shall just as systematically investigate other possible al- 
ternatives. 

About the action of the two substances we can again make a 
number of hypotheses, and again we shall confine ourselves in this 
paper to a particular one, without prejudice to other possible ones. 
We shall consider the case, that the transmission of excitation from
neurone to neurone takes place in two ways: first by local bioelectric
currents produced by adjacent neurones, and second by the excitatory
substance. We shall consider that the excitation by the latter depends
on the ratio $e/i$ of the concentrations of the excitatory and
inhibitory substances, so that when this ratio exceeds a critical value
excitation occurs. Without any loss of generality, we may assume
this critical value to be equal to 1. Then excitation occurs when
$e - i > 0$; when $e - i < 0$ the effect is an inhibitory one. We shall
consider the general case, that a finite time $\tau_i$ is required to
inhibit a nerve when $e - i < 0$, this being a function of $e - i$.

In biological phenomena we most frequently find that the action 
of any agency, physical or chemical, is not merely characterized by 
its instantaneous effects, but by certain after-effects. Such
after-effects are not the monopoly of the living. We find them quite
generally in all physical phenomena.3 As a matter of fact, we would not be
too far from truth in asserting that almost in every physical phe- 
nomenon after-effects do exist, and if we usually do not observe them 
it is because of their very short duration. In some physical phenom- 
ena, and especially in biology, those durations are much longer. 

We therefore shall consider quite generally that the threshold 
of a nerve depends on whether it has been previously excited or not. 
Two alternatives are possible. One is that the threshold of a nerve is 
permanently increased by excitation; the other is that it is lowered. 
We shall, again without prejudice to the other possibility, consider 
here the latter case. 


II 


It will be now our task to develop the mathematical consequences 
of this picture of the central nervous system, by investigating vari- 
ous special cases of this general concept and by gradually complicat- 
ing those cases. 

In the present paper we shall confine ourselves to more or less 
outlining a large part of the whole field. We shall build a system here 
in width rather than in depth. We do not attempt therefore a par- 
ticular mathematical rigor. Rather than give definite results of cal- 
culations, we shall confine ourselves here to indicating how a given 
problem can be treated mathematically from a biophysical point of 
view, as well as point out the connections between the various prob- 
lems resulting from this point of view. In a series of subsequent 
publications, to which this paper is intended as an introduction, we 
shall elaborate each separate problem mathematically in detail. 

Logically the simplest special cases to be considered are those in 
which either only excitatory or only inhibitory nerves are present. 
Those two cases lead however to trivial results and are of little in- 
terest. 

The next case is when both excitatory and inhibitory nerve ele- 
ments are present, but when they are segregated in two sufficiently 
separated groups, so that the elements of one group are not appreci- 
ably affected by the substances produced by the elements of the other 
group. Stimulation of any inhibitory nerve-element will produce no 
response whatsoever. Stimulation of an excitatory nerve element will 
produce such a response. Two possibilities are included in this case.
If all elements of the excitatory group are interconnected with each
other, a stimulation of any single afferent element will result in a
response of all efferent ones. If, however, due to structural and
geometric arrangements of the neuro-elements, each element is connected
only with a limited number of others, then stimulation of a single
afferent element results in a response of a limited group of effectors.

3 N. Rashevsky, Über Hysterese-Erscheinungen in physikalisch-chemischen
Systemen, Zsch. f. Phys., 53, 102-106, 1928; Über den zeitlichen Verlauf
der thermodynamischen Prozesse und die dadurch hervorgerufenen
Hysterese-Erscheinungen, Zsch. f. Phys. 60, 237-242, 1930.

More interesting possibilities are offered by the case in which 
some of the inhibitory elements are so close to some of the excitatory, 
that the substances produced by one group affect the elements of the 
other. Let for instance a group of afferent excitatory neuro-elements
$a_e$ (Fig. 1) be transmitting the excitation to a group of efferent
neuro-elements $ef$ in the region $A$. If in the same region endings of
afferent inhibitory neuro-elements $a_i$ are present, then various
interesting phenomena are possible.

FIGURE 1

Stimulation of $a_e$ alone will result in a production of the
excitatory substance at $A$. The simplest law governing the production
of this substance to be considered first is that the concentration $e$
of the latter at $A$ increases proportionally to the intensity of
excitation $I_e$ of $a_e$ and decreases due to diffusion or metabolic
destruction, proportional to itself. We thus have at $A$

$$\frac{de}{dt} = a I_e - k e, \tag{1}$$

which gives

$$e = \frac{a I_e}{k}\left(1 - e^{-kt}\right); \tag{2}$$

taking as initial concentration $e = 0$, which will be the case in the
absence of any preceding stimulation. The strength $R$ of the response
in $ef$ will be in general a function of both $I_e$ and $e$, and, since
$I_e$ and $e$ are connected by (2), can be expressed in terms of $I_e$
only:

$$R = f(I_e, e) = f_1(I_e). \tag{3}$$


The stimulation of $a_i$ results in a production of the inhibitory
substance, for whose concentration $i$ we have a similar expression:

$$\frac{di}{dt} = b I_i - m i; \tag{4}$$

or

$$i = \frac{b I_i}{m}\left(1 - e^{-mt}\right). \tag{5}$$


The action of the two substances on the threshold of $ef$ being
antagonistic, the latter will in general be infinite when $e/i < 1$,
and be finite for $e/i > 1$. If now $a_e$ has been continuously
stimulated for some time, so that the asymptotic value $aI_e/k$ of $e$
has been nearly reached, and then $a_i$ is stimulated, then, as $i$
increases, the response of $ef$ is gradually inhibited. If
$aI_e/k < bI_i/m$, then $e/i$ will become smaller than 1, and a
complete inhibition of $ef$ will result. If $aI_e/k > bI_i/m$, $R$ will
be reduced, but not completely inhibited.
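The behaviour described by (1), (2), (4) and (5) can be checked numerically. The following is a minimal sketch in Python; the function name `concentration` and all numerical constants are illustrative assumptions and are not taken from the paper.

```python
import math

def concentration(gain, intensity, decay, t):
    """Solution of dx/dt = gain*intensity - decay*x with x(0) = 0,
    i.e. the common form of (2) and (5)."""
    return gain * intensity / decay * (1.0 - math.exp(-decay * t))

# Illustrative constants (assumed, not from the text).
a, k, I_e = 1.0, 0.5, 2.0   # excitatory substance, asymptote a*I_e/k
b, m, I_i = 1.0, 0.2, 1.0   # inhibitory substance, asymptote b*I_i/m

e_inf = a * I_e / k
i_inf = b * I_i / m

# If a*I_e/k < b*I_i/m the ratio e/i ends below 1: complete inhibition of ef.
completely_inhibited = e_inf < i_inf

# Concentrations long after the onset of stimulation approach the asymptotes.
e_late = concentration(a, I_e, k, 50.0)
i_late = concentration(b, I_i, m, 50.0)
```

With these assumed values the excitatory asymptote lies below the inhibitory one, so the sketch reproduces the case of complete inhibition; exchanging the two intensities gives the case in which $R$ is only reduced.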

A next complication in our scheme is brought by considering the
case of an excitatory cross-connection between $a_e$ and $a_i$, as shown
by the dotted line on Fig. 1, so that stimulation of $a_i$ results also
in stimulation of $a_e$. Then if $aI_e/k < bI_i/m$, stimulation of $a_i$
will result in an excess of $i$ at $A$, whether $a_e$ is stimulated or
not. The response $a_e \to ef \to R$ is inhibited. If now, after
stimulating $a_i$ for a while, so that $i$ has approached its asymptotic
value $bI_i/m$, and $e$ has approached its asymptotic value $aI_e/k$,
stimulation of $a_i$ is interrupted, two things may happen, depending on
whether $k < m$ or $k > m$. In the former case, both $e$ and $i$ drop
down to zero exponentially, but $e$ decreases more slowly and therefore
after some time $i$ will drop below $e$ (Fig. 2), which will result in
an excitation of $ef$. We have a “rebound phenomenon”. If $k > m$, $e$
remains always smaller than $i$, and no rebound occurs. Considering an
excitatory connection between $a_e$ and $a_i$, such that stimulation of
$a_e$ stimulates also $a_i$, we find that it would result in no response
at $ef$ at all, when either $a_e$ or $a_i$ are stimulated, if
$aI_e/k < bI_i/m$. For the case of $aI_e/k > bI_i/m$ on the contrary,
stimulation of either $a_e$ or $a_i$ would result in a response of $ef$.

FIGURE 2
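The rebound condition can be made concrete. After the stimulation stops, both concentrations decay as simple exponentials from their asymptotic values, so the moment at which $e$ overtakes $i$ follows in closed form. The sketch below assumes exactly that decay law; the function name `rebound_time` and the numbers in the usage note are hypothetical.

```python
import math

def rebound_time(e0, i0, k, m):
    """Time after the end of stimulation at which e(t) = e0*exp(-k*t)
    overtakes i(t) = i0*exp(-m*t). Returns None when no rebound occurs."""
    if e0 <= 0.0:
        return None          # no excitatory substance left to rebound
    if e0 >= i0:
        return 0.0           # e already exceeds i, nothing to wait for
    if k >= m:
        return None          # e decays at least as fast as i: no rebound
    # e0*exp(-k*t) = i0*exp(-m*t)  =>  t = ln(i0/e0) / (m - k)
    return math.log(i0 / e0) / (m - k)
```

For instance, with the assumed values `e0 = 4`, `i0 = 5`, `k = 0.2`, `m = 0.5` the function returns a finite positive time, whereas exchanging `k` and `m` returns `None`, which corresponds to the case $k > m$ in which no rebound occurs.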

Let us now consider the next complex case, namely that each
afferent excitatory neuro-element sends off inhibitory collaterals to
the centers of all other afferent excitatory elements, so that
excitation of any afferent ending produces an inhibition of the central
ends of all the other afferent elements. A simultaneous stimulation of
several peripheral afferent excitatory elements may in this case result
in an inhibition of all corresponding central ends. Let the number of
stimulated elements be $r$. Then at the $i$-th center we have:

$$\frac{de_i}{dt} = a I_i - k e_i, \qquad \frac{di_i}{dt} = b \sum{}' I_j - m i_i, \tag{6}$$

where $\sum'$ means that the sum is to be taken over all the $(r-1)$
elements except the $i$-th element itself. From (6) we have:

$$e_i = \frac{a I_i}{k}\left(1 - e^{-kt}\right) \quad\text{and}\quad i_i = \frac{b \sum' I_j}{m}\left(1 - e^{-mt}\right). \tag{7}$$

If all centers are stimulated with the same intensity, so that
$I_i = I$, then the corresponding asymptotic values of $e_i$ and $i_i$
are:

$$e_i = \frac{aI}{k}; \qquad i_i = \frac{b(r-1)I}{m}. \tag{8}$$

If

$$\frac{a}{k} < \frac{b(r-1)}{m}, \tag{9}$$

then $e_i < i_i$ and the net result of the stimulation of all centers is
their inhibition. If, however, one peripheral afferent neuro-element is
stimulated stronger than the others, things will be different. Let
$I_1 > I_2 = I_3 = \cdots = I_r = I$, then the asymptotic values for
$e_1$ and $i_1$ are:

$$e_1 = \frac{aI_1}{k}; \qquad i_1 = \frac{b(r-1)I}{m}; \tag{10}$$

and for $e_i$ and $i_i$ $(i = 2, 3, \cdots, r)$

$$e_i = \frac{aI}{k}; \qquad i_i = \frac{bI_1}{m} + \frac{b(r-2)I}{m}. \tag{11}$$

Hence

$$e_1 - i_1 = \frac{aI_1}{k} - \frac{b(r-1)I}{m}. \tag{12}$$

This is positive when:

$$\frac{I_1}{I} > \frac{b(r-1)k}{am}. \tag{13}$$

For $e_i - i_i$ we have however:

$$e_i - i_i = \frac{aI}{k} - \frac{bI_1}{m} - \frac{b(r-2)I}{m} < \frac{aI}{k} - \frac{b(r-1)I}{m} < 0, \tag{14}$$

on account of (9).
That is, in spite of the stimulation of all $r$ afferent
neuro-elements, only $a_1$ will excite its center, and this also only
provided $I_1$ is sufficiently larger than $I$.
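The selection of the most strongly stimulated center can be illustrated with a few lines of Python. The sketch evaluates the asymptotic difference $e_i - i_i$ that follows from (6) for an arbitrary list of intensities; the function name `asymptotic_net` and the numerical values in the usage note are assumptions made for illustration.

```python
def asymptotic_net(intensities, a, k, b, m):
    """Asymptotic e_i - i_i at every center, from (6):
    e_i = a*I_i/k and i_i = b*(sum of all other I_j)/m."""
    total = sum(intensities)
    return [a * I / k - b * (total - I) / m for I in intensities]
```

Taking, as an assumed example, $a = k = b = m = 1$ and $r = 4$, condition (9) holds. `asymptotic_net([1, 1, 1, 1], 1, 1, 1, 1)` is then negative at every center, while `asymptotic_net([5, 1, 1, 1], 1, 1, 1, 1)` is positive only at the first center, in agreement with (12) to (14).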


III 


Let us consider now the intensity of excitation at a center in its 
dependence on the intensity of the peripheral stimulus. We use the 
word “center” in the usual neurological sense, denoting a group of 
neurones in the central nerve system, not directly connected to the 
periphery, but to which stimuli conducted by peripheral fibers are 
relayed through one or more intermediate neurones. Accepting the 
all-or-nothing law throughout, we must still consider the variation of 
the frequencies of the nerve impulses in a single fiber with increasing 
strength of the stimulus.4 If $h$ is the threshold of a fiber, then a
stimulus of intensity $S < h$ does not produce any response. But for
$S > h$ the fiber responds with a frequency $\nu$, increasing with
$S - h$. Since the frequency cannot increase indefinitely, the upper
limit being set by the refractory phase of each impulse, the function

$$\nu = \nu(S - h) \tag{15}$$

tends asymptotically to a constant maximal value when $S - h$
increases, and is zero for $S - h = 0$.

4 Adrian, Mechanism of Nervous Action, Univ. of Pa. Press, Philadelphia,
1932.

Let the thresholds of different fibers in a nerve tract or trunk be
distributed according to some distribution function $N(h)$, giving the
numbers of fibers with threshold lying between $h$ and $h + dh$. We
have:

$$\int_0^\infty N(h)\, dh = N, \tag{16}$$

$N$ being the total number of fibers in the tract or trunk. Different
fibers will have for the same stimulus different frequencies. A
stimulus of strength $S$ stimulates not all $N$ fibers, but only

$$\int_0^S N(h)\, dh. \tag{17}$$

The total intensity of excitation of the tract or of the
corresponding center may be measured by the sum of the products of the
number of excited fibers by the frequency of each fiber. Since however
different fibers have different frequencies, the total intensity is
given by:

$$I_t = \int_0^S N(h)\, \nu(S - h)\, dh = F(S). \tag{18}$$


This is the intensity of excitation at the central end of the tract. But 
the intensity of excitation of the center itself may in general be dif- 
ferent. If the excitable neurones, which constitute the center, are 
cross-connected in a complex way, some of the neurones of the center 
will be excited by two or more of the incoming fibers. For instance
neurones 23 and 34 are connected each with two incoming fibers (Fig.
3).

FIGURE 3

Some of such neurones will have sufficiently high thresholds, so
that the excitation of one fiber alone, connected with it, does not
produce any excitation; whereas the excitation of both of the incoming
fibers does result in an excitation. Thus for instance 23 may become
excited only if both 2 and 3 are excited, and so on. If only 1 and 2
are excited, the excitation under those conditions is transmitted to
1’, 2’ and 12. Excitation of 3 and 4 is transmitted to 3’, 4’ and 34.
But if 1, 2, 3 and 4 are excited simultaneously, then not only 1’, 2’, 12,
3’, 4’ and 34 are excited, but also 13, 14, 23, 24. In other words, in- 
creasing the number of incoming excited fibers two times, we have 
increased the number of excited neurones of the center more than 
twice. This will hold for any other more complex arrangement of 
neurones, when each may be connected with more than two, etc. Each
pair of “cross-connecting” neurones may be cross-connected by
neurones of higher order, etc. If we therefore double the excitation
$I_t$ of the tract, the excitation $I_c$ of the center itself is more
than doubled. Hence in a first approximation

$$I_c \approx A I_t^{\alpha}, \tag{19}$$

where $\alpha > 1$. $\alpha$ can be considered as a measure of the
number of intraconnections between the various neurones. If all
neurones are strictly “lined up”, forming a system of parallel linear
chains without cross-connections, $\alpha$ is equal to 1. In general we
have

$$I_c = f(I_t), \tag{20}$$

where $f(x)$ is a function that increases more rapidly than $x$.
If therefore two tracts leading to the same center are excited
with the corresponding intensities $I_1$ and $I_2$ we have at the center

$$I_c = (I_1 + I_2)^{\alpha} > I_1^{\alpha} + I_2^{\alpha}. \tag{21}$$
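The counting argument behind (19) can be reproduced directly for the four-fiber arrangement of Fig. 3. The sketch below assumes the simplest such arrangement, in which every fiber has one directly connected neurone and every pair of fibers shares one high-threshold neurone that needs both; the function name `excited_center_neurones` is hypothetical, and the exponent computed at the end is only an effective value for this toy case.

```python
import math
from itertools import combinations

def excited_center_neurones(fibers):
    """Neurones excited when each fiber f has a direct neurone f' and
    every pair of excited fibers shares one neurone requiring both."""
    direct = [f"{f}'" for f in fibers]
    paired = [f"{x}{y}" for x, y in combinations(sorted(fibers), 2)]
    return direct + paired

two = excited_center_neurones([1, 2])          # 1', 2', 12
four = excited_center_neurones([1, 2, 3, 4])   # adds 13, 14, 23, 24, 34

# Effective exponent alpha in I_c ~ A * I_t**alpha when I_t is doubled.
alpha = math.log(len(four) / len(two), 2)
```

Doubling the number of excited fibers from two to four raises the number of excited neurones from 3 to 10, so the effective exponent for this assumed arrangement exceeds 1, as (19) requires.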


IV 


The next generalization of this scheme is to consider the case,
that every afferent neuro-element is connected anatomically with every
efferent one, but that in general, while some connections have
sufficiently low thresholds, others have not. Such an interconnection
is possible only if every afferent neuro-element branches off to
several centers. Figure 4 shows schematically such a case for three
afferent elements $a_{u1}$, $a_{u2}$ and $a_{u3}$, connected with the
corresponding efferent groups $e_1$, $e_2$ and $e_3$ both through low
threshold elements $A$ and through high threshold elements $B$. The
inhibitory neuro-elements, which each afferent center sends to all
other centers, are not shown on the drawing. Let those inhibitory
fibers lead only to the centers $B$, but none to the centers $A$.
Stimulation of $a_{u1}$, $a_{u2}$ and $a_{u3}$ results in excitation of
$e_1$, $e_2$ and $e_3$ correspondingly. Stimulation of the three other
afferent neuro-elements $a_{c1}$, $a_{c2}$ and $a_{c3}$, which are
connected with $e_1$, $e_2$ and $e_3$ only through high threshold
elements $B$, does not produce any response.

FIGURE 4

Let there be $r$ elements $a_u$ and $s$ elements $a_c$. If

$$\frac{a}{k} < \frac{b(r-1)}{m} \quad\text{as well as}\quad \frac{a}{k} < \frac{b(s-1)}{m}, \tag{9'}$$

then excitation of either a single $a_u$ element or a single $a_c$
group will result only in inhibitions of all corresponding centers.
Consider however the case that we excite simultaneously $a_{ui}$ and
$a_{cj}$. Let the intensity of excitation at the central end of any
tract $a_c$ be $I_c$, the intensity at the central end of any tract
$a_u$ be $I_u$. For simplicity we consider all $I_c$ equal amongst
themselves, as well as all $I_u$. But $I_c \neq I_u$. If now $a_{ui}$
and $a_{cj}$ are excited simultaneously, we have at the center $B_{ij}$:


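$$\frac{de_{ij}}{dt} = a(I_c + I_u)^{\alpha} - k e_{ij}, \qquad \frac{di_{ij}}{dt} = b(r-1) I_c^{\alpha} + b(s-1) I_u^{\alpha} - m i_{ij}, \tag{22}$$

or, for the asymptotic values:

$$e_{ij} = \frac{a(I_c + I_u)^{\alpha}}{k}; \qquad i_{ij} = \frac{b}{m}(r-1) I_c^{\alpha} + \frac{b}{m}(s-1) I_u^{\alpha}. \tag{23}$$

For every center $B_{ip}$ $(p \neq j)$:

$$e_{ip} = \frac{a I_u^{\alpha}}{k}; \qquad i_{ip} = \frac{b}{m}(I_c + I_u)^{\alpha} + \frac{b}{m}(r-1) I_c^{\alpha} + \frac{b}{m}(s-2) I_u^{\alpha}; \tag{24}$$

and for any $B_{qj}$ $(q \neq i)$:

$$e_{qj} = \frac{a I_c^{\alpha}}{k}; \qquad i_{qj} = \frac{b}{m}(I_c + I_u)^{\alpha} + \frac{b}{m}(r-2) I_c^{\alpha} + \frac{b}{m}(s-1) I_u^{\alpha}. \tag{25}$$

Hence

$$e_{ij} - i_{ij} = \frac{a}{k}(I_c + I_u)^{\alpha} - \frac{b}{m}(r-1) I_c^{\alpha} - \frac{b}{m}(s-1) I_u^{\alpha}. \tag{26}$$

Since $(I_c + I_u)^{\alpha} > I_c^{\alpha} + I_u^{\alpha}$, therefore

$$(I_c + I_u)^{\alpha} = I_c^{\alpha} + I_u^{\alpha} + \delta, \tag{27}$$

$$\delta > 0. \tag{28}$$

The greater $\delta$, the greater $\alpha$, and $\delta$ is equal to
zero for $\alpha = 1$.
If

$$\frac{a}{k}\,\delta > \left[\frac{b}{m}(r-1) - \frac{a}{k}\right] I_c^{\alpha} + \left[\frac{b}{m}(s-1) - \frac{a}{k}\right] I_u^{\alpha} > 0, \tag{29}$$

then

$$e_{ij} - i_{ij} > 0. \tag{30}$$

The right hand side of (29) is always positive, because of (9) and
(9'). For $e_{ip} - i_{ip}$ we have:

$$e_{ip} - i_{ip} = \frac{a I_u^{\alpha}}{k} - \frac{b}{m}(I_c + I_u)^{\alpha} - \frac{b}{m}(r-1) I_c^{\alpha} - \frac{b}{m}(s-2) I_u^{\alpha} < 0,$$

as well as

$$e_{qj} - i_{qj} = \frac{a I_c^{\alpha}}{k} - \frac{b}{m}(I_c + I_u)^{\alpha} - \frac{b}{m}(r-2) I_c^{\alpha} - \frac{b}{m}(s-1) I_u^{\alpha} < 0. \tag{31}$$

Hence simultaneous stimulation of $a_{ui}$ and $a_{cj}$ will result in
an excitation of the center $B_{ij}$, and in inhibition of all other
$B$'s.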
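The sign pattern claimed by (26), (30) and (31) can be checked with a small numerical sketch. It assumes the reading in which every excited $B$ center inhibits every other one, with $(r-1)$ multiplying $I_c^{\alpha}$ and $(s-1)$ multiplying $I_u^{\alpha}$; the function name `net_excitation` and all numerical constants are illustrative assumptions.

```python
def net_excitation(a, k, b, m, r, s, I_c, I_u, alpha):
    """Asymptotic e - i at the common center B_ij, at any B_ip (p != j)
    and at any B_qj (q != i), in the form of (26) and (31)."""
    both = (I_c + I_u) ** alpha      # excitation where the two tracts meet
    c = I_c ** alpha                 # center fed by a conditioned tract only
    u = I_u ** alpha                 # center fed by an unconditioned tract only
    at_ij = a / k * both - b / m * ((r - 1) * c + (s - 1) * u)
    at_ip = a / k * u - b / m * (both + (r - 1) * c + (s - 2) * u)
    at_qj = a / k * c - b / m * (both + (r - 2) * c + (s - 1) * u)
    return at_ij, at_ip, at_qj

# Assumed constants satisfying (9) and (9'): a/k = 1 < b*(r-1)/m = 1.2.
with_cross = net_excitation(1.0, 1.0, 0.6, 1.0, 3, 3, 1.0, 1.0, alpha=2.0)
no_cross = net_excitation(1.0, 1.0, 0.6, 1.0, 3, 3, 1.0, 1.0, alpha=1.0)
```

With $\alpha = 2$ only the common center comes out positive, while with $\alpha = 1$ (so that $\delta = 0$) it is inhibited as well, which is the dependence on $\delta$ expressed by (29).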

The original threshold of the efferent centers being in general a
function of both $I_c$ and $e - i$, let $N(I_c, e-i)\, dI_c\, d(e-i)$
denote the number of efferent neurones just stimulated when $I_c$ is
between $I_c$ and $I_c + dI_c$ and $e - i$ between $(e-i)$ and
$(e-i) + d(e-i)$. We have

$$\int_0^\infty dI_c \int_0^\infty d(e-i)\, N = N_0, \tag{32}$$

the total number of the efferent neurones in the region considered.
Let $N_\tau(\tau)$ be the number of neurones which, when stimulated,
require a duration $\tau$ of the stimulation in order to lower their
original threshold so far that a stimulation $a_{cj}$ alone would
produce a response.

$$\int_0^\infty N_\tau(\tau)\, d\tau = 1. \tag{33}$$

Let us first consider the case that the average of $\tau$ is much
larger than either $1/k$ or $1/m$. In that case we may neglect the
initial variation of $e - i$ from zero to the asymptotic value, and
consider with sufficient approximation that $e - i$ has from the
beginning the constant value given by (26). Then the number of neurones
stimulated by $e - i$ and $I_c$ is equal to:

$$N'(I_c, e-i) = \int_0^{I_c} dI_c \int_0^{e-i} d(e-i)\, N. \tag{34}$$

Of those $N' N_\tau(\tau)$ will have lowered their threshold
permanently in a time $\tau$. Hence:

$$N' \int_0^t N_\tau(\tau)\, d\tau = N^* \tag{35}$$

gives the total number of neurones of $B$ that have a “low threshold”
after the time $t$.

If now $a_{cj}$ is stimulated alone, and if $N_i(\tau_i, i-e)$ is the
distribution function of the time $\tau_i$, which it takes $i$ to act
on a nerve in order to raise its threshold temporarily to infinity,
then

$$\int_0^{i-e} d(i-e) \int_0^t N_i(\tau_i, i-e)\, d\tau_i = N(i-e, t) \tag{36}$$

will give the number of neurones inhibited completely after the time
$t$.

For $t = 0$, that is immediately after stimulation of $a_{cj}$,
$N = 0$ and therefore, according to (35), a response will be produced
through $N^*$ neurones, which had their threshold lowered before. The
strength of this response is given by $N^{*\alpha}$. This can be called
a conditioned reflex, for, previous to the simultaneous stimulation of
$a_{ui}$ and $a_{cj}$, stimulation of $a_{cj}$ did not produce any
response at all. (35) gives us the strength of the response as a
function of the time of duration of the simultaneous stimulation of
$a_{ui}$ and $a_{cj}$. If this simultaneous stimulation is done
intermittently at regular intervals, then the time as given by (35) is
equal to the total number of stimulations $n$, times the duration
$\vartheta$ of each stimulus

$$t = n\vartheta. \tag{37}$$

In general the lowering of a threshold of a neurone may be a
reversible process, so that in the absence of excitation the threshold
again spontaneously rises. This will affect equation (37) in such a
way that the actual time will be greater than $n\vartheta$, the
difference increasing with increasing intervals between the
stimulations, because during those intervals the thresholds will have
risen. Accepting some definite law of “recovery” for the threshold, we
may in this way develop a general theory of conditioning by
intermittent stimulation.

The case that $\tau$ is comparable with $1/k$ or $1/m$ is more
complex, and shall be treated elsewhere.

When $a_{cj}$ is stimulated alone after conditioning, then the
conditioned response has the intensity $N^{*\alpha}$ only at the very
beginning, while $N$ is close to zero. As the stimulation of $a_{cj}$
alone is continued, $N$ increases according to (36) and the conditioned
response is decreasing according to

$$N_c = \left[N^* - N(i-e, t)\right]^{\alpha}. \tag{38}$$


A stimulation of $a_{ui}$ alone would also result in an inhibition of
the center $B_{ij}$. A different case is obtained if we consider that
the mutual inhibition of the various branches of $a_u$ is less than
that of various branches of $a_c$, which can be brought about by a
difference in the constants $b$; thus $b_u < b_c$. Then we may have

$$\frac{b_u(s-1)}{m} < \frac{a}{k} < \frac{b_c(r-1)}{m}. \tag{39}$$

A calculation like the one before shows now that, while stimulation
of any $a_{cj}$ produces inhibition in all $B_{qj}$, stimulation of
$a_{ui}$ results in an excitation of all $B_{ip}$. In particular, the
common center $B_{ij}$ will be excited every time that $a_{ui}$ is
stimulated. This will result in general in rendering $B_{ij}$
conducting without the simultaneous stimulation of $a_{cj}$. If however
the threshold of $B_{ij}$ has a lower limit $h$, and if

$$e_{ip} - i_{ip} = \ldots < h, \tag{40}$$

then stimulation of $a_{ui}$ alone will not stimulate the
“high-threshold” neurones of $B_{ip}$. Simultaneous stimulation of
$a_{ui}$ and $a_{cj}$ results however in an increase of
$e_{ij} - i_{ij}$, which instead of being given by (40) is now given by
(26). For a sufficiently large $\delta$, we shall have now

$$e_{ij} - i_{ij} > h, \tag{41}$$

so that a simultaneous stimulation of $a_{ui}$ and $a_{cj}$ stimulates
$B'_{ij}$ and results in a conditioning of $a_{cj}$. The conditioned
response is not inhibited by stimulation of $a_{ui}$ alone, but is
inhibited by stimulation of $a_{cj}$ alone.

If some afferent neuro-elements send off inhibitory fibers not 
only towards the centers of other afferent elements, but to the peri- 
pheral ends of other inhibitory fibers, then stimulation of those affer- 
ent elements will result in an inhibition of the inhibitory fibers and 
in a “disinhibition” of inhibited reactions. Considerations of sections 
III and IV show that the interaction of excitatory and inhibitory 
nerves results in a sort of concentration of the excitation in the re- 
gion which is excited strongest. Such effects may be produced also 
by a number of different mechanisms. Consider for instance the the- 
oretically possible case (which incidentally apparently happens to 
actually exist in reality) that the excitation of a region produces a 
dilatation of blood vessels in that region, thus increasing the blood 
supply. This will be connected with a “draining” of blood from ad- 
jacent regions. If the increased blood supply in its turn increases the 
excitability of the region, we shall have again a concentration effect. 
Temporary increase of a blood supply may produce lasting changes 
in the nerve tissue, and a mathematical theory of conditioning based 
on this picture, and presenting some rather interesting features, may 
thus be developed. We must postpone the presentation of those re- 
sults to a next publication. 


V 


Consider now the more general case that a stimulus of a
peripheral nerve sets forth a number of reactions, which in their turn
set forth new stimuli, and so on. Since for each component of this
chain of reactions a finite time is required, a single short stimulus
will produce a relatively slowly varying process in the brain, which
will be in general propagated from one region to another with a finite
speed. We might consider a still more complex case, namely the
existence of a special center, in which a short peripheral stimulus
releases a slow reaction. In either case we shall have a propagation
of some physico-chemical process, sweeping over a number of B centers,
and at a given moment $t$ those centers B will be excited most strongly
which are just reached by the peak of the “wave.” The excitation in
the centers reached previously will be already somewhat attenuated,
the attenuation being the stronger the larger $t - t_1$, $t_1$ denoting
the time at which the previous center was just reached.

If $a_u$ is stimulated at the moment $t$, $a_c$ having been stimulated
at the moment $t = 0$, then, according to (34) and (35), the strongest
conditioned response to $a_c$ will be through those B-centers which
have a maximum excitation at the moment $t$, since $N^*$ increases with
$e - i$ and $I$. Therefore if $a_u$ is stimulated $t$ seconds after
$a_c$ during the process of conditioning, then the maximum response to
$a_c$ alone will occur $t$ seconds after its stimulation, because it
occurs through B-centers which are reached by the excitation from
$a_c$ a time of $t$ seconds after the stimulation of $a_c$ (delayed
reflex).

A particularly interesting result is obtained if we consider the
case that the “high-threshold” elements on the efferent side of each
center B also send inhibitory fibers to the other B centers. Then, as
some of those “high-threshold” neurones become “low-threshold”
and are stimulated, this results in an increased inhibition of the
other centers. If there are p strongly excited centers and q weakly
excited ones, we have for the former:


$$ (e - i) = \frac{aI}{k} - \frac{b(p-1)I}{m} - \frac{bqI'}{m} \tag{42} $$

and for the latter:

$$ (e - i)' = \frac{aI'}{k} - \frac{bpI}{m} - \frac{b(q-1)I'}{m} , \tag{43} $$


$I$ being the intensity of the strong excitation, $I'$ of the weak one,
$I' < I$. If $I'/I$ is sufficiently small, then $(e - i) > 0$, but
$(e - i)' < 0$. In other words, the weaker excited center will become
inhibited completely. If the attenuation of the excitation, after the
passage of the excitation wave, takes place sufficiently rapidly, in
other words, if the excitation wave is sufficiently steep, then all
centers which are reached at times $t_1$ for which $t - t_1$ is
sufficiently large will become inhibited, as the center $B_t$, reached
by the excitation at the moment $t$, becomes completely conducting.
While at first a stimulation of $a_c$ will result in a reflex which
will begin immediately after $a_c$ but have its strongest intensity at
$t$, gradually only the reflex at $t$ will subsist.
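
The competition between strongly and weakly excited centers can be checked numerically. The following is a minimal sketch, assuming that the net excitation of a center is its own excitation ($aI/k$) minus the inhibition it receives from every other center ($bI/m$ per strong center, $bI'/m$ per weak one), which is how (42) and (43) are read here. The function name and the numerical values are illustrative assumptions, not taken from the paper.

```python
def net_excitations(i_strong, i_weak, p, q, a, b, k, m):
    """Return ((e - i), (e - i)') for a strong and a weak center.

    Assumed reading of (42) and (43): own excitation minus the
    inhibition received from all *other* centers.
    """
    strong = a * i_strong / k - b * (p - 1) * i_strong / m - b * q * i_weak / m
    weak = a * i_weak / k - b * p * i_strong / m - b * (q - 1) * i_weak / m
    return strong, weak


# Hypothetical constants: one strong center, three weak ones.
strong, weak = net_excitations(i_strong=10.0, i_weak=1.0, p=1, q=3,
                               a=1.0, b=0.2, k=1.0, m=1.0)
print(strong > 0, weak < 0)
```

With a small ratio $I'/I$ the strong center keeps a positive net excitation while the weak centers are driven below zero, which is the concentration effect described in the text.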

A center $B_t$ reached by the excitation at the moment $t$ would
have been conditioned to the extent given by equations (34) and (35),
in which the value of $(e - i)$ at the point $B_t$ at the time $t$ has
to be used for $e - i$. As the conditioning of the B centers proceeds,
$i$ at $B_t$ will increase and $(e - i)$ decrease. If in (34) and (35)
we consider that $e - i$ is not a constant, but a monotonically
decreasing function of time, $(e - i)(t)$, then we can calculate the
number $N_1^*$ of “low threshold” neurones at any time $t$ in the
following manner. At the time $t$


there are:

$$ \int_0^{(e-i)(t)} N(h)\,dh = N^*(t) \tag{44} $$

excited neurones. Of those

$$ N_1(t)\,dt \int_0^{(e-i)(t)} N(h)\,dh \tag{45} $$

will just become “low threshold” between $t$ and $t + dt$. Hence

$$ N_1^*(t) = \int_0^t N_1(t)\,dt \int_0^{(e-i)(t)} N(h)\,dh \tag{46} $$

is the number of low threshold neurones at the moment $t$. $N_1^*(t)$
is a known functional of the function $(e - i)(t)$,

$$ N_1^*(t) = F_1[(e-i)(t)] . \tag{47} $$


This holds for values of $t < t^*$, where $t^*$ denotes the root of the
equation

$$ (e-i)(t) = 0 . \tag{48} $$

For $t > t^*$, $e - i < 0$ and the situation is described by (38). Hence

$$ N_1^*(t) = N_1^*(t^*) - \bar{N}(t) \tag{49} $$

with

$$ \bar{N}(t) = \int_0^t N_3[(e-i)(t)]\,dt , \tag{50} $$

$N_3$ being the same as in (36).
Actually $(e - i)(t)$ is not given, but itself is determined by the
rate of increase of $N_1^*$ at the B-center, for $i$ at $B_t$ increases
as $N_1^*$ increases. So that

$$ N_1^*(t) = F_2[N_1^*(t)] \tag{51} $$

as well as

$$ N_2^*(t) = F_3[N_1^*(t)] . \tag{52} $$


The functional equations (51) and (52) determine $N_1^*(t)$ and
$N_2^*(t)$. It can be shown in general that while $N_1^*$ increases
monotonically, approaching the asymptotic value determined by


f Nucyan [Moa ; (53) 


$N_2^*$ will reach a maximum $N_{2m}^*$ after a time $t = t_m$, and then
drop to zero, since $(e - i)'$ becomes and remains negative, according
to (42) and (43). Details must be given elsewhere.

The formal general consideration which we have made in a previous
paper⁵ in regard to the general necessity of stimulating $a_c$ before
$a_u$, if we consider the spreading of the excitation wave, can be
applied, as will be easily seen by the reader, to the present case
without any modifications.


VI 
If two different stimulus patterns are produced

$$ A = (a_{ci}, a_{ck}, a_{cl}, a_{cm}) $$

and

$$ A_1 = (a_{cp}, a_{cq}, \cdots, a_{cr}) , $$

then, due to the non-additivity expressed by (19), in general different
combinations of B centers are excited each time. If $A$ is conditioned
to a response R, and then $A_1$ is stimulated, then the stimulation of
$A_1$ results in a stimulation of $N''$ neurones of the B centers, which
in general are different from the $N'$ neurones stimulated by $A$. They
may however have some neurones $N'''$ in common. $A_1$ will therefore
produce also a response R, via the common $N'''$ neurones. A repetition
of $A_1$ without reinforcement will result in an inhibition of the $N''$
neurones including the $N'''$ common ones. Stimulation of $A$ will
produce now a response via $N' - N'''$ neurones characteristic of $A$
only. A differentiation between $A$ and $A_1$ is thus obtained. (Cf.
loc. cit.⁵).

We may consider the ratio $\sigma = 2N'''/(N' + N'')$ as a measure of
the difference between the two patterns $A$ and $A_1$. For totally
different patterns $N'''$ is zero and $\sigma = 0$; for identical
patterns $\sigma = 1$.
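
The ratio $\sigma$ can be computed directly when each pattern is represented as a set of central neurones. The sketch below is only an illustration under that assumption; the neurone labels and function names are hypothetical.

```python
def sigma(pattern_a, pattern_a1):
    """Ratio 2N'''/(N' + N'') for two sets of central neurones."""
    common = pattern_a & pattern_a1          # the N''' shared neurones
    return 2 * len(common) / (len(pattern_a) + len(pattern_a1))


def neurones_after_differentiation(pattern_a, pattern_a1):
    """Neurones still carrying the response to A once A1 is inhibited."""
    return pattern_a - pattern_a1            # N' - N''' neurones


A = {1, 2, 3, 4}       # hypothetical neurones excited by pattern A
A1 = {3, 4, 5, 6}      # hypothetical neurones excited by pattern A1
print(sigma(A, A1), neurones_after_differentiation(A, A1))
```

Disjoint sets give a ratio of 0 and identical sets give 1, so the ratio grows with the overlap of the two central patterns.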


5 N. Rashevsky, Phil. of Sc. 1, 409, 1934.


VII


We can now consider a more general case of an organism subject
to a set of complex unconditioned stimuli, which can be divided
into two groups. One group is such that the unconditioned reflex $R_1$
produced by it results in a stimulus $s_1'$, which produces as an
unconditioned reflex the opposite of $R_1$, which we shall denote by
$\bar{R}_1$. The other group of stimuli has no such characteristic.
Consider a case of the first group. Let $s_1$ produce $R_1$, and let
this result after a time $t$ in $s_1'$, which produces $\bar{R}_1$. Then
$s_1$ plays the role of a conditioned stimulus towards the delayed
unconditioned $s_1'$, and $s_1$ will result in a conditioned delayed
reflex $s_1 \to \bar{R}_1$, which will have its strongest intensity $t$
seconds after $s_1$. As we have seen, however, in the initial stages
of conditioning the delayed reflex will in general be produced also
at times $t' < t$, only with a weaker intensity, which decreases with
$t - t'$. In particular there will be a reflex $(s_1 \to \bar{R}_1)(0)$
at $t = 0$, that is, immediately after the stimulus $s_1$. If the
intensity of $(s_1 \to \bar{R}_1)(0)$, which we shall denote by $I_0$,
is larger than the intensity $I_1$ of $(s_1 \to R_1)$, then
$s_1 \to R_1$ will not be produced. The stimuli of the first group thus
become gradually inefficient. The time which it takes to make
$(s_1 \to \bar{R}_1)(0)$ stronger than $s_1 \to R_1$ is determined by
(51) and (52), and is proportional to the number of repetitions
necessary, according to (37). If the functions left indefinite in
(44)-(52) are given, we can explicitly calculate the number of
repetitions necessary to eliminate the “wrong” act. In order that it
should be possible at all, it is necessary that the maximum intensity
$I_{0m}$ of $I_0$ at B(0) should be greater than $I_1$. And since
$I_{0m}$ is a decreasing function of $t - t'$, it follows that when $t$
becomes very large $I_{0m} > I_1$ will be impossible, and $s_1 \to R_1$
will not be inhibited. A situation of such a nature is found in some
trial-and-error problems, in which a wrong reaction results in an
opposite one; for instance, reaching for a wrong thing results in a
retraction of the hand; or going into a wrong passage in a maze
results in retracing the steps, etc. For the case of a maze it follows
that, under the theoretical conditions studied in this paper, if the
incorrectness of an attempt becomes apparent only after a very long
time, the incorrect action will take many repetitions to be
eliminated. When this time is too long, no elimination will take
place.
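
The role of the delay can be made concrete with a small numerical sketch. The paper states only that the maximum intensity of the immediate opposing reflex decreases with the delay; the exponential decay used below, and the names `peak`, `tau` and `i_direct`, are assumptions introduced purely for illustration.

```python
import math


def opposing_intensity(delay, peak, tau):
    """Assumed maximum intensity of the immediate opposing reflex."""
    return peak * math.exp(-delay / tau)     # illustrative decay law only


def can_eliminate(delay, peak, tau, i_direct):
    """The wrong act can be eliminated only if the opposing reflex wins."""
    return opposing_intensity(delay, peak, tau) > i_direct


def critical_delay(peak, tau, i_direct):
    """Largest delay for which elimination is still possible."""
    if peak <= i_direct:
        return 0.0
    return tau * math.log(peak / i_direct)


print(can_eliminate(1.0, peak=4.0, tau=2.0, i_direct=1.0))
print(can_eliminate(3.0, peak=4.0, tau=2.0, i_direct=1.0))
```

Under this assumed law there is a finite critical delay beyond which the wrong act is never eliminated, in line with the qualitative conclusion drawn for the maze.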

We may also consider a more complex case, namely, that a wrong
act produces a stimulus which excites a special center whose activity
inhibits all others (pain center⁶), while a correct act is one
which excites a center that produces a general excitation (pleasure).
The inhibition will be formed as a delayed reflex to $s_1$, and if
at $t = 0$ it is already strong enough, it will eliminate $R_1$. A
successful act will on the contrary be enhanced. Again the necessary
number of repetitions is determined by (51) and (52).


6 N. Rashevsky, Jl. Gen. Psych. 13, 208, 1935.


VIII


L. L. Thurstone⁷ has developed a rational theory of learning
curves, in which the average number of repetitions necessary to
eliminate a wrong act enters as a parameter of the curve, $1/k$. We are
now in a position to express in principle $k$ through the physical
constants of the brain by using (52) and (51).

Thurstone makes the very interesting remark that, if k > 1, it 
means a presence of rational thinking. For a subject by eliminating 
one erroneous possibility at the same time eliminates several other 
possibilities, without trying them out. Our theory enables us to in- 
terpret the case of k > 1 in physico-physiological terms. 

In order that the elimination of one possibility should result in
a partial or total elimination of another, there must be some
similarity between the corresponding two stimuli, or in logical terms,
there must be some common criteria to the two situations. If all
situations to be tried out have no such common criterion, then, no
matter how ingenious the person is, he can never eliminate a case by
reasoning based on previous experience. Thus if a number of playing
cards is scattered face down at random on a table and it is required
to pick up, say, all the spades, keeping the right card and returning
the wrong one again to the table, then, no matter how ingenious a
person is, the elimination of one card does not increase his knowledge
of the others. If, however, the spades are arranged all in the same
direction, then, with a sufficient amount of intelligence, the person
may pick up all correct cards after only one or two trials.
Biophysically speaking all this means that, in order to make any
reasoning or insight possible, several different situations
characterized by different stimulus-patterns must have some elements
of those patterns in common. If the organism has eliminated after a
repetition a pattern $A_1$, then, if the pattern $A_2$ has a part
$A_{12}$ in common with $A_1$, the response to the pattern $A_2$ will be
weakened by the elimination of $A_1$. If $N_1$ is the number of central
neurones involved in $A_1$, $N_2$ the number of neurones involved in
$A_2$, and if $N_{12}$ is the number of neurones common to both

patterns, then, after elimination of $A_1$, the response to $A_2$ is
produced only through $N_2 - N_{12}$ neurones. Inasmuch as the response
increases with the number of neurones involved, the response to $A_2$
is now weakened. If there were a direct proportionality between the
number of neurones and the intensity of the response, the response to
$A_2$ would now have been only $(1 - N_{12}/N_2)$ of its initial
intensity. In general, however, instead of the factor
$(1 - N_{12}/N_2)$ we shall have some function of it,
$F(1 - N_{12}/N_2)$, which in the first approximation is
$(1 - N_{12}/N_2)^{\beta}$, where $\beta > 1$ is closely related to
$\alpha$ of section III, though not necessarily the same as $\alpha$.
$\beta$ however also measures the amount of “cross-connections” between
the neurones. When one possibility is eliminated by trial, then we may
say that a fraction

$$ \nu = 1 - \left(1 - \frac{N_{12}}{N_2}\right)^{\beta} = 1 - \vartheta^{\beta} $$

of another possibility is also eliminated. If out of the total number
of wrong possibilities there is a group of $n$ possibilities with the
same $\vartheta$, in other words with the same $N_{12}/N_2$, then the
elimination of the first possibility by trial results in an
elimination of $(n-1)\nu = (n-1)(1 - \vartheta^{\beta})$ other
possibilities.


7 Jl. Gen. Psych. 2, 469, 1930.

$\beta$, as we just said, is determined by the structure of the brain.
$\vartheta = (1 - N_{12}/N_2)$ depends however both on the nature of the
stimulus pattern, in other words, on the nature of the problem, and on
the structure of the brain. $N_{12}/N_2$ refers to the neurones in the
centers. Two excitation patterns in the center will in general have
the more common neurones, the more common neurones are involved in the
peripheral excitation patterns. But there is no direct proportionality
between those, due to the existence of cross-connections. Considering
again Fig. 3, let (1,2,3) be one peripheral pattern, and (2,3,4)
another. They both have neurones 3 and 2 in common. We have
$N_1^p = 3$, $N_2^p = 3$, $N_{12}^p = 2$, $N_{12}^p/N_2^p = 2/3$. In the
center the same patterns involve neurones (1', 12, 2', 23, 3', 13) and
(2', 23, 3', 34, 4', 24) correspondingly, and the common neurones are
2', 23 and 3'. We have at the center $N_1 = 6$, $N_2 = 6$,
$N_{12} = 3$, $N_{12}/N_2 = 3/6 < N_{12}^p/N_2^p$. This result holds
quite generally: for a given peripheral $N_{12}^p/N_2^p$, the
corresponding central $N_{12}/N_2$ will be the smaller, the more there
are cross-connections, in other words, the larger $\beta$. Furthermore
the decrease of $N_{12}/N_2$ at the center as compared with the
periphery will be the larger, the greater $N_1$ and $N_2$, the total
number of neurones involved, because the ratio of the number of
cross-connections to the total number of neurones increases with
increasing this total number. If we consider only cross-connections of
first order we see that their number $N^*$ is equal to the number of
combinations of $N$ elements taken 2 at a time, that is

$$ N^* = \frac{N!}{2!\,(N-2)!} . $$

Hence

$$ \frac{N^*}{N} = \frac{(N-1)!}{2!\,(N-2)!} = \frac{N-1}{2} . \tag{54} $$

Considering cross-connections of higher order, the above statement
holds a fortiori. Hence we see that at the centers $N_{12}/N_2$ is the
smaller, the larger the total number of elements involved, or the more
complex the peripheral pattern. Thus $\vartheta$ is itself a function
both of $\beta$ and of the particular configuration of the peripheral
stimulus-pattern.
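
The Fig. 3 example and relation (54) can be reproduced with a few lines of code. This is a minimal sketch assuming that each peripheral neurone has one direct central neurone and that each pair of peripheral neurones shares one first-order cross-connection neurone; the tuple labels are an illustrative encoding, not the paper's notation.

```python
from itertools import combinations
from math import comb


def central_pattern(peripheral):
    """Central neurones excited by a peripheral pattern (first order only)."""
    direct = {(i,) for i in peripheral}                  # 1', 2', 3', ...
    cross = set(combinations(sorted(peripheral), 2))     # 12, 13, 23, ...
    return direct | cross


def overlap_ratio(first, second):
    """Common neurones divided by the size of the second pattern."""
    return len(first & second) / len(second)


p1, p2 = {1, 2, 3}, {2, 3, 4}
peripheral_ratio = overlap_ratio(p1, p2)
central_ratio = overlap_ratio(central_pattern(p1), central_pattern(p2))

# Cross-connections per neurone, N*/N = (N - 1)/2, for a few pattern sizes.
cross_per_neurone = [comb(n, 2) / n for n in (3, 5, 9)]
print(peripheral_ratio, central_ratio, cross_per_neurone)
```

The central overlap (3 of 6) is smaller than the peripheral one (2 of 3), and the number of cross-connections per neurone grows with the size of the pattern, as the text argues.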

There is however a second factor, determined by the structure 
of the brain, which enters into $\vartheta$. Two peripheral stimulus patterns
may have no elements at all in common, yet the corresponding central 
pattern may not only have common elements, but even be entirely 
identical. Such cases we have studied previously in discussing the 
physico-mathematical aspects of Gestalt-transposition’. In this case 
some mechanism of the nature described in the above referred paper 
is involved. This mechanism is characterized by special arrangements 
and properties of a large group of neurones, and is intercalated be- 
tween the periphery and the B-centers, in which conditioning occurs. 
Let us call the neurones belonging to this mechanism g-neurones,
their total number being $N_g$.

If every peripheral receptor is connected with the corresponding
B-center only through g-neurones, then a number of peripheral stimulus
patterns, which have no actual neural elements in common, but
which are characterized by internal physical or geometrical
similarity, would produce identical central patterns. The elimination
of one such peripheral pattern would result in the elimination of all
others of a similar nature. For instance elimination of a square will
eliminate all squares, regardless of size, position, color, etc.

If however in general, besides g-neurones, there are direct
connections involving $N_d$ neurones between periphery and B-centers,
then two geometrically similar peripheral patterns will not produce
identical central patterns. Those latter, however, will have a number
of elements in common, and this number will be the greater, the larger
the ratio $N_g/N_d$, though again there is no direct proportionality,
because there may be common neurones amongst the $N_d$. Hence
$\vartheta$ depends also on $N_g/N_d$, being the smaller, the larger
$N_g/N_d$.


If the elimination by trial of one possibility requires $y$
repetitions, then $y$ repetitions actually eliminate
$1 + (n-1)(1 - \vartheta^{\beta})$ possibilities. The average number of
repetitions per possibility, which is Thurstone’s $1/k$, is equal to

$$ \frac{1}{k} = \frac{y}{1 + (n-1)(1 - \vartheta^{\beta})} , $$

or

$$ k = \frac{1 + (n-1)(1 - \vartheta^{\beta})}{y} . \tag{55} $$

$y$ is always either larger than or equal to 1. Therefore $1/y \le 1$.
But if $(n-1)(1 - \vartheta^{\beta})$ is sufficiently large, then $k$
may be greater than one. $k$ is a function of $\beta$, as well as of
$N_g/N_d$, which characterizes the structure of the brain. But it also
is a function of the character of the problem. For stimulus patterns
so chosen that $N_{12} \to 0$, $\vartheta \to 1$, we always shall have
$k \le 1$. The dependence of $k$ on $\beta$ is a double one due to the
fact that $\vartheta$ is itself a function of $\beta$. This latter
function depends on the assumptions which we make about the
distribution and number of cross-connections. For any definite
assumptions about this distribution, $\vartheta(\beta, N_g/N_d)$ can be
calculated explicitly, and hence also $k$ can be given in terms of the
structural constants of the brain.
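
Formula (55) is easy to evaluate once values for its parameters are chosen. The sketch below assumes the reading $k = [1 + (n-1)(1 - \vartheta^{\beta})]/y$; the function name and the numbers are illustrative and are not taken from the paper.

```python
def thurstone_k(n, theta, beta, y):
    """Learning constant k from (55).

    n     -- number of wrong possibilities sharing the same theta
    theta -- 1 - N12/N2, the non-overlapping fraction of a pattern
    beta  -- exponent measuring the amount of cross-connections
    y     -- repetitions needed to eliminate one possibility by trial
    """
    eliminated = 1 + (n - 1) * (1 - theta ** beta)   # possibilities removed
    return eliminated / y


no_overlap = thurstone_k(n=11, theta=1.0, beta=2.0, y=3)     # nothing shared
half_overlap = thurstone_k(n=11, theta=0.5, beta=2.0, y=3)   # patterns overlap
print(no_overlap, half_overlap)
```

When the patterns share nothing the value reduces to $1/y$ and cannot exceed one; with enough overlap among many possibilities it rises above one, which is the case Thurstone associates with rational thinking.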

Formulae, similar to (55), can be derived for the more general 
case that all the possibilities consist of several unequal groups, with 
the same similarities within each group, etc. 

All the above is approximate, inasmuch as we do not consider
that $k$ varies with time as the learning proceeds, because the first
trial eliminates a larger fraction than the second, and so on. This is
because the conditioning curve tends asymptotically to its maximum
value.

Similar considerations may be applied to a reinforcement of the
correct response, which is in general characterized by a different
constant.⁸


IX 


The case of a purely rational solution of a problem may be
schematically represented in the following way. Let the past history
of the individual be such that whenever he reacted by the reaction
R to a stimulus pattern $A_1(a_i, a_k, a_l)$, whether presented alone or
in combination with other elements, R resulted in a stimulation of a
general excitatory center P (pleasure center⁶). Therefore $A_1$ becomes
conditioned to P, so that $A_1$ results in a stimulation of the latter.
Let the individual meet several situations $A_2$, $A_3$, etc. Let one of
them, say $A_i(a_i, a_k, a_l, a_s, a_p, a_q)$, contain amongst its
elements the elements $a_i, a_k, a_l$ which constitute $A_1$, and let
the others not contain any of those elements. We say that $A_i$
“contains” $A_1$. We may say that the individual knows that amongst a
number of situations that which “contains” $A_1$ is the correct one
(the one to which a response is a pleasurable one).

Let in general any neutral stimulus be permanently connected
with a general inhibitory center D. Any stimulus not containing
$a_i, a_k, a_l$ excites D and produces no reaction. But let now $A_i$ be
presented. $A_1$ and $A_i$ have common neurones in the center. The
neurones common to $A_1$ and $A_i$ are those conditioned to P and
therefore by stimulating P tend to produce a reaction of the organism
to $A_i$. The other neurones of $A_i$, not being conditioned to P,
excite only D, which inhibits the reaction. In order that the
individual should react to $A_i$, the excitation of P must exceed that
of D. The ratio of the intensities of the two excitations depends on
the structural characteristics of the corresponding centers. But
besides that it depends on the ratio $N_{1i}/N_i$. If $N_{1i}/N_i$ is
zero, only D is excited. Since we have seen that $N_{1i}/N_i$ decreases
as $N_i$, which characterizes the complexity of $A_i$, increases, we see
that for a sufficiently complex $A_i$ the excitation of D prevails. The
organism does not react at all. For a small enough complexity, P
prevails and a reaction is produced to $A_i$. But no reaction will be
produced to any other pattern which does not contain $A_1$. Thus the
correct pattern is chosen by a purely cerebral process, not involving
any overt trial and elimination. Making definite assumptions about the
distribution of cross-connections, we


can, as we have seen, calculate the central $N_{1i}/N_i$ from the
peripheral $N_{1i}^p/N_i^p$, which is a function of the complexity $M$
of the peripheral stimulus pattern. As a measure of $M$ we can take
some conveniently chosen monotonically increasing function of $N_i$.
If the intensity of excitation of P is proportional to $N_{1i}$, that
of D to $N_i - N_{1i}$, then the requirement of the possibility of a
purely rational solution of the problem amounts to

$$ A N_{1i} > B (N_i - N_{1i}) , \tag{56} $$

where $A$ and $B$ are constants, again depending on the structure of
the brain (number of connecting neurones, etc.). (56) can be written

$$ A \frac{N_{1i}}{N_i} > B \left(1 - \frac{N_{1i}}{N_i}\right) $$

or

$$ \frac{N_{1i}}{N_i} > \frac{B}{A+B} , \tag{57} $$

$N_{1i}/N_i$ being a decreasing function of the complexity $M$ of the
problem.

Since $N_{1i}/N_i = F(M)$ we have

$$ F(M) > \frac{B}{A+B} , \tag{58} $$

which gives the upper limit of the complexity $M$ of a problem which
an individual can solve by pure reasoning. This upper limit is
expressed in terms of constants, which characterize the structure of
the individual’s brain. When $M$ exceeds the critical limit, a rational
solution becomes impossible, but the possibility of a solution by trial
and error remains. Further details and explicit calculations of the
constants involved in (58) from specific assumptions about the
arrangements of neurones will be given elsewhere.


8 Gulliksen, H., Jour. Gen. Psychol. 11, 395, 1934.

Here we shall only remark that the reaction time to the “cor- 
rect” pattern also depends on its complexity M, increasing with the 
latter. It takes longer “thinking” to solve a complex problem. 
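
The critical complexity implied by condition (58) can be found by direct search once a form for $F(M)$ is assumed. The sketch below takes, purely as an illustrative assumption, a central pattern made of one direct neurone per peripheral element plus one first-order cross-connection neurone per pair, and it measures the complexity $M$ by the number of peripheral elements. The function names and the constants `A` and `B` are hypothetical.

```python
from math import comb


def central_count(n_elements):
    """Central neurones for a pattern: direct ones plus pairwise cross-connections."""
    return n_elements + comb(n_elements, 2)


def max_solvable_complexity(n_known, A, B, m_limit=1000):
    """Largest M with F(M) = N_1i / N_i > B / (A + B), or None if there is none."""
    threshold = B / (A + B)
    best = None
    for M in range(n_known, m_limit + 1):
        ratio = central_count(n_known) / central_count(M)   # assumed F(M)
        if ratio > threshold:
            best = M
        else:
            break          # F(M) is decreasing, so no larger M can qualify
    return best


# Hypothetical case: the learned pattern has 3 elements, and A = 3, B = 1.
print(max_solvable_complexity(n_known=3, A=3.0, B=1.0))
```

Raising $A$ relative to $B$ pushes the limit upward, which is one way to picture how the structural constants bound the problems solvable without overt trial.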

An explicit calculation of the constants leads also to the
expression of various directly measurable quantities, such as speed of
conditioning, of external inhibition, of solving a given problem either
by trial or by reasoning, in terms of the physico-chemical and
geometrical constants characterizing the hypothetical structure of the
brain, which is taken as a basis of the explicit calculations. For a
given theoretical picture we know the number of independent constants
which characterize the system. Hence we can express any of the above
mentioned quantities as a function of those independent constants,

$$ U_i(\alpha_1, \alpha_2, \cdots, \alpha_n) . $$

In the first approximation the $U_i$ are linear in the $\alpha_k$, with
known coefficients $a_{ik}$:

$$ U_i = \sum_k a_{ik} \alpha_k . $$

Considering the general case that those constants $\alpha_k$ vary from
individual to individual, we may calculate the correlation coefficients
between any pair of quantities $U_i$, $U_j$, by considering the
$\alpha_k$ as coordinates in a hyperspace, and the $U_i$ as vectors with
components $a_{ik}$.
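
The geometric picture can be sketched as follows. If the structural constants are assumed to be uncorrelated across individuals and to have equal variances (an assumption added here; the paper does not state it), the correlation between two linear quantities equals the cosine of the angle between their coefficient vectors. The coefficient values below are hypothetical.

```python
from math import sqrt


def correlation_from_coefficients(coeffs_i, coeffs_j):
    """Cosine of the angle between two coefficient vectors.

    Equals the correlation of the two linear quantities only under the
    assumption of uncorrelated, equal-variance structural constants.
    """
    dot = sum(x * y for x, y in zip(coeffs_i, coeffs_j))
    norm_i = sqrt(sum(x * x for x in coeffs_i))
    norm_j = sqrt(sum(y * y for y in coeffs_j))
    return dot / (norm_i * norm_j)


speed_of_conditioning = [1.0, 2.0, 0.0]     # hypothetical coefficients
speed_of_reasoning = [2.0, 4.0, 0.0]        # proportional to the first
external_inhibition = [0.0, 0.0, 5.0]       # depends on another constant
print(correlation_from_coefficients(speed_of_conditioning, speed_of_reasoning))
print(correlation_from_coefficients(speed_of_conditioning, external_inhibition))
```

Quantities that depend on the same constants in the same proportions correlate perfectly, while quantities that depend on disjoint constants do not correlate at all.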


The University of Chicago, 
Chicago, Ill. 
SIMPLIFIED CALCULATION OF PRINCIPAL COMPONENTS


HAROLD HOTELLING 


The resolution of a set of tests or other variates into components
$\gamma_1, \gamma_2, \cdots$, each of which accounts for the greatest
possible portion of the total variance of the tests unaccounted for by
the previous components, has been dealt with by the author in a previous
paper (2). Such “factors,” on account of their analogy with the
principal axes of a quadric, have been called principal components. The
present paper describes a modification of the iterative scheme of cal- 
culating principal components there presented, in a fashion that ma- 
terially accelerates convergence. The application of the iterative pro- 
cess is not confined to statistics, but may be used to obtain the mag- 
nitudes and orientations of the principal axes of a quadric or hyper- 
quadric in a manner which will ordinarily be far less laborious than 
those given in books on geometry. This is true whether the quadrics 
are ellipsoids or hyperboloids; the proof of convergence given in an 
earlier paper is applicable to all kinds of central quadrics. For
hyperboloids some of the roots $k_i$ of the characteristic equation would be
negative, while for ellipsoids all are positive. If in a statistical prob- 
lem some of the roots should come out negative, this would indicate 
either an error in calculation, or that, if correlations corrected for 
attenuation had been used, the same type of inconsistency had crept 
in that sometimes causes such correlations to exceed unity. 

Another method of calculating principal components has been 
discovered by Professor Truman L. Kelley, which involves less labor 
than the original iterative method, at least in the examples to which 
he has applied it (5). How it would compare with the present accel- 
erated method is not clear, except that some experience at Columbia 
University has suggested that the method here set forth is the more 
efficient. It is possible that Kelley’s method is more suitable when all 
the characteristic roots are desired, but not the corresponding cor- 
relations of the variates with the components. The present method 
seems to the computers who have tried both to be superior when the 
components themselves, as well as their contributions to the total var- 
iance, are to be specified. The advantage of the present method is en- 
hanced when, as will often be the case in dealing with numerous vari- 
ates, not all the characteristic roots but only a few of the largest
are required.

Iterative processes of various kinds are capable of acceleration
by means of the matrix-squaring device here used. In particular, the 
simultaneous determination of a linear function of a set of variates, 
and of another linear function of another set, such that the correla- 
tion of these two functions is a maximum, may be facilitated in this 
way. This problem of the most predictable criterion and the best
predictor has been discussed briefly by the author (3), and will be
treated more fully in a forthcoming paper.

Let $r_{ij}$ be the covariance of the ith and jth of a set of n
variates $x_1, \cdots, x_n$; then if units have been chosen such that
each standard deviation is unity, each $r_{ii} = 1$, and the $r_{ij}$’s
are the correlations. If we take any arbitrary set of numbers
$a_1, \cdots, a_n$ and substitute in the formula

$$ a_i' = \sum_{j=1}^{n} r_{ij} a_j , \qquad (i = 1, 2, \cdots, n) \tag{1} $$

the new set of numbers $a_1', \cdots, a_n'$ will be proportional to the
old if, and only if, they are also proportional to the correlations of
one of the principal components with the original variates. These
correlations are also the coefficients of the particular principal
component in the equations which give the x’s in terms of the
$\gamma$’s. If $a_i' = k a_i$ for each i, then k is the sum of the
squares of the correlations of the x’s with the particular $\gamma$.

If the $a_i'$ are not proportional to the $a_i$, they may be
substituted in the right-hand members of (1), and will then give rise
to another set of values $a_1'', \cdots, a_n''$, such that

$$ a_m'' = \sum_i r_{mi} a_i' . \tag{2} $$


If the new quantities are treated in the same way, and this
process is repeated a sufficient number of times, the ratios among the
quantities obtained will eventually become and remain arbitrarily
close to those among the coefficients of one of the $\gamma$’s. This
was demonstrated in the fourth section of the previous paper on
principal components. The component thus specified in the limit will,
apart from a set of cases of probability zero, be that having the
greatest sum of squares of correlations with the x’s. This sum will
equal the limit $k_1$ of the ratio of any one of the trial values to the
corresponding one in the previous set.
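
The basic iteration can be sketched in a few lines. The example below is an illustration only: the 2×2 correlation matrix, the starting values, and the rescaling of the trial values after each step (which leaves their ratios unchanged) are choices made here, not prescriptions from the paper.

```python
def mat_vec(R, a):
    """Multiply the trial values by the rows of R, as in formula (1)."""
    return [sum(r_ij * a_j for r_ij, a_j in zip(row, a)) for row in R]


def iterate(R, a, steps):
    """Repeat substitution (1); return the last ratio a'/a and the trial values."""
    ratio = None
    for _ in range(steps):
        new = mat_vec(R, a)
        ratio = new[0] / a[0]                 # tends to the largest root k_1
        scale = max(abs(v) for v in new)
        a = [v / scale for v in new]          # rescaling keeps the ratios intact
    return ratio, a


R = [[1.0, 0.5],
     [0.5, 1.0]]                              # hypothetical correlation matrix
k1, a = iterate(R, [1.0, 0.0], steps=60)
print(k1, a)
```

For this matrix the ratio settles at 1.5 and the trial values become equal, so the first component weights both variates equally.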

Now if we substitute (1) in (2), and define

$$ c_{mj} = \sum_i r_{mi} r_{ij} , \tag{3} $$

we shall have

$$ a_m'' = \sum_j c_{mj} a_j . \tag{4} $$

Consequently if we first calculate the quantities $c_{mj}$, we may use
(4) instead of (1), and then each iteration is precisely equivalent to
two iterations with the original correlations. Thus the number required
for any given degree of accuracy is cut in half.

Let R denote the matrix of the covariances $r_{ij}$. Then from (3),
$c_{mj}$ is the element in the mth row and jth column of the symmetrical
matrix $R^2$. Substitution of a set of trial values in (1) is equivalent
to multiplying it by the rows of R, while substitution in (4) amounts to
multiplication by the rows of $R^2$.

But we need not stop with this improvement. Having doubled
the speed of convergence by squaring R, we can double it again by
squaring $R^2$. If we square a third time we have a matrix $R^8$, by
which a multiplication is equivalent to eight multiplications by the
original matrix, and so forth. We can square as many times as we like;
if we square s times successively and denote $2^s$ by t, we obtain
$R^t$, with which one step of the iterative process is equivalent to t
steps of the process with the original matrix. The only limit to this
acceleration is reached when the convergence is so rapid that an
additional squaring of the matrix is not worth while.
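
The equivalence between one step with the squared matrix and several steps with the original one can be verified directly. The following sketch uses a small hypothetical matrix and pure Python helpers; the helper names are illustrative.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][h] * B[h][j] for h in range(n)) for j in range(n)]
            for i in range(n)]


def mat_vec(M, a):
    """Multiply a set of trial values by the rows of M."""
    return [sum(m_ij * a_j for m_ij, a_j in zip(row, a)) for row in M]


def square_repeatedly(R, s):
    """Square the matrix s times, giving R to the power t = 2**s."""
    for _ in range(s):
        R = mat_mul(R, R)
    return R


R = [[1.0, 0.5],
     [0.5, 1.0]]
R8 = square_repeatedly(R, 3)                  # t = 2**3 = 8

one_fast_step = mat_vec(R8, [1.0, 0.0])
eight_slow_steps = [1.0, 0.0]
for _ in range(8):
    eight_slow_steps = mat_vec(R, eight_slow_steps)
print(one_fast_step, eight_slow_steps)
```

Three squarings cost three matrix products, after which every substitution does the work of eight substitutions with the original correlations.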

The ultimate ratio of consecutive values, such as $a_i'/a_i$, was $k_1$
in the original process. In the accelerated process, using $R^t$, this
ratio is $k_1^t$. Instead of extracting the tth root to find $k_1$, it
is better to make a final multiplication of the trial values by the
rows of R itself, and so upon division to find $k_1$. This saves labor
and also provides a final check upon the calculations, including the
squaring of the matrices.
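
A minimal sketch of this final step follows, under the assumption that the trial values have already nearly converged after two substitutions with the eighth power of a small hypothetical matrix. All names and numbers are illustrative.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][h] * B[h][j] for h in range(n)) for j in range(n)]
            for i in range(n)]


def mat_vec(M, a):
    """Multiply a set of trial values by the rows of M."""
    return [sum(m_ij * a_j for m_ij, a_j in zip(row, a)) for row in M]


R = [[1.0, 0.5],
     [0.5, 1.0]]
Rt = R
for _ in range(3):                            # three squarings give t = 8
    Rt = mat_mul(Rt, Rt)

a = [1.0, 0.0]
for _ in range(2):                            # two accelerated substitutions
    a = mat_vec(Rt, a)

ratio_with_Rt = mat_vec(Rt, a)[0] / a[0]      # approximates k_1 ** 8
k1 = mat_vec(R, a)[0] / a[0]                  # final multiplication by R itself
print(k1, ratio_with_Rt)
```

Dividing after the multiplication by R gives the root itself, and comparing its eighth power with the ratio obtained from the squared matrix serves as the check mentioned in the text.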

An additional check upon the squaring operations may be accom- 
plished by carrying along an extra column as in the method of least 
squares. Each entry in this check column is the sum of those preced- 
ing it in the same row. The check column is multiplied by each row 
of the matrix to obtain the check column for the square of the matrix. 
This check is not so essential as in the method of least squares, in 
view of the final substitution just mentioned, and since the calcula- 
tions are so simple that an experienced computer with a good machine 
is not likely to make a mistake. However, for an ordinary computer, 
especially if the variates are numerous and the squaring is repeated 
several times, there is likely to be an eventual saving of labor if this 
check is made at each step. 
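
The check column can be illustrated as follows. The sketch assumes that the check entry of each row is the sum of the entries of that row, so that multiplying the check column by the rows of the matrix must reproduce the row sums of the squared matrix. The 3×3 matrix is hypothetical.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][h] * B[h][j] for h in range(n)) for j in range(n)]
            for i in range(n)]


def mat_vec(M, a):
    """Multiply a column by the rows of M."""
    return [sum(m_ij * a_j for m_ij, a_j in zip(row, a)) for row in M]


def row_sums(M):
    """Check column: the sum of the entries in each row."""
    return [sum(row) for row in M]


R = [[1.0, 0.5, 0.25],
     [0.5, 1.0, 0.5],
     [0.25, 0.5, 1.0]]

check = row_sums(R)                    # check column carried beside R
predicted = mat_vec(R, check)          # check column multiplied by each row
actual = row_sums(mat_mul(R, R))       # row sums of the squared matrix
print(predicted, actual)
```

Any disagreement between the two columns points to an arithmetic slip in the corresponding row of the squared matrix.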

In the determination of the second and later principal components
by this method, the convergence may be accelerated in the same
manner by the use of the tth power of the matrix of the reduced
covariances. However there is a further saving of labor here if we form
this power, not directly as in the case of $R^t$ by repeated squarings,
but with the help of the determination already made by $R^t$, and the
following results obtained with the help of the algebra of matrices
(Bôcher, 1907, 1921).

Putting $C_1$ for the matrix in which the element in the ith row
and jth column is $a_{i1}a_{j1}$ ($i, j = 1, 2, \cdots, n$), the matrix
of the reduced covariances used in finding the second principal
component is

$$ R_1 = R - C_1 . $$


From relations established in the former memoir (2) (p. 424, equation
(16), and p. 425) we have that

$$ \sum_h r_{mh} a_{h1} = k_1 a_{m1} , $$

and

$$ \sum_h a_{h1}^2 = k_1 . $$

These lead to the matrix relations

$$ R C_1 = C_1 R = k_1 C_1 , $$

and

$$ C_1^2 = k_1 C_1 . $$

From these it is easy to show for any integer n that

$$ R^n C_1 = R C_1^n = k_1^n C_1 , $$

$$ C_1^n = k_1^{n-1} C_1 . $$

Hence we readily obtain

$$ R_1^2 = R^2 - 2RC_1 + C_1^2 = R^2 - k_1 C_1 , $$

and in general

$$ R_1^t = R^t - k_1^{t-1} C_1 . \qquad (t = 2^s) $$


The partial cancellation of the middle by the last term in the squar- 
ing is strikingly reminiscent of some of the formulae in the method 
of least squares, with which the method of principal components pre- 
sents many analogies. 

From the last matrix equation we derive the following simplified 
method of obtaining numerical values of the desired power of the 
reduced matrix: 

Having determined $k_1^t$ as the ratio of consecutive trial values
with the matrix $R^t$, and $k_1$ as the ratio of consecutive trial
values with $R$, find $k_1^{t-1}$ by division. Multiply this by each
of the quantities $a_{i1}a_{j1}$ $(i, j = 1, \cdots, n)$ and subtract
the products from the corresponding elements of $R^t$ to obtain the
elements of $R_1^t$.
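The rule just stated can be tried on a small case. The sketch below is only an illustration: the 2x2 matrix, the helper names `mat_mul` and `repeated_square`, and the choice of three squarings are assumptions made here, not taken from the paper. It forms the eighth power of the reduced matrix in two ways, by squaring the reduced matrix itself and by the shortcut $R^t - k_1^{t-1}C_1$, and reports the largest disagreement.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][h] * b[h][j] for h in range(n)) for j in range(n)]
            for i in range(n)]

def repeated_square(m, squarings):
    """Return m raised to the power 2**squarings by repeated squaring."""
    for _ in range(squarings):
        m = mat_mul(m, m)
    return m

# Hypothetical 2x2 correlation matrix (not from the paper). Its greatest
# root is k1 = 1.5, with an eigenvector proportional to (1, 1).
R = [[1.0, 0.5], [0.5, 1.0]]
k1 = 1.5
a = [math.sqrt(k1 / 2.0), math.sqrt(k1 / 2.0)]  # loadings; squares sum to k1

C1 = [[a[i] * a[j] for j in range(2)] for i in range(2)]
R1 = [[R[i][j] - C1[i][j] for j in range(2)] for i in range(2)]

squarings = 3          # t = 2**3 = 8
t = 2 ** squarings

direct = repeated_square(R1, squarings)   # reduced matrix squared three times
Rt = repeated_square(R, squarings)        # R^t, assumed already on hand
shortcut = [[Rt[i][j] - k1 ** (t - 1) * C1[i][j] for j in range(2)]
            for i in range(2)]

max_gap = max(abs(direct[i][j] - shortcut[i][j])
              for i in range(2) for j in range(2))
print(max_gap)
```

The two results agree to within floating-point rounding, so the reduced matrix never has to be squared separately once $R^t$ and $k_1$ are known.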

The elements of $R_1$ themselves are found as in the former paper,
i. e., by subtracting $a_{i1}a_{j1}$ from the corresponding elements
of $R$. The second principal component is found from $R_1$ and $R_1^t$
in exactly the same manner as the first component from $R$ and $R^t$.
To obtain the matrices $R_2$ and $R_2^t$ from which the third
principal component is to be found, the elements of $R_1$ and $R_1^t$
are diminished respectively by $a_{i2}a_{j2}$ and by
$k_2^{t-1}a_{i2}a_{j2}$; and similarly for the later components, if
enough is left of the aggregate variance to make these worth
computing.

If we subtract $k$ from each element of the principal diagonal of $R$
the resulting determinant may be called $f(k)$. Now multiply $f(k)$ by
$f(-k)$, rows by rows. The resulting determinant is identical with
that obtained from the matrix $R^2$ by subtracting $k^2$ from each
element of the principal diagonal. But if, in the equation
$f(k)\,f(-k) = 0$, we substitute $k^2 = x$, we obtain an equation of
degree $n$ in $x$ whose roots are the squares of those of $f(k) = 0$.
This fact shows that not only the greatest root but all the roots of
the characteristic equation of $R^2$ are the squares of the roots of
the characteristic equation of $R$. Our new method is thus brought
into colligation with the classical root-squaring method of solving
algebraic equations (6), whose fundamental principle is to increase
the separation between roots. The iterative process will in general
converge rapidly only when the roots are well separated.
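The determinant identity used in this argument can be checked numerically. The sketch below is an illustration under assumptions: it borrows the four-variable correlation matrix of the EXAMPLE section, and the helpers `det`, `minus_diag` and `f` are names introduced here. It compares $f(k)\,f(-k)$ with the determinant of $R^2$ after $k^2$ has been subtracted from its principal diagonal, for a few arbitrary values of $k$.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][h] * b[h][j] for h in range(n)) for j in range(n)]
            for i in range(n)]

def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def minus_diag(m, s):
    """Subtract s from each element of the principal diagonal of m."""
    n = len(m)
    return [[m[i][j] - (s if i == j else 0.0) for j in range(n)]
            for i in range(n)]

# Correlation matrix of the EXAMPLE section.
R = [
    [1.0, 0.9596, 0.7686, 0.5427],
    [0.9596, 1.0, 0.8647, 0.7005],
    [0.7686, 0.8647, 1.0, 0.8230],
    [0.5427, 0.7005, 0.8230, 1.0],
]
R2 = mat_mul(R, R)

def f(k):
    """Characteristic determinant of R at k."""
    return det(minus_diag(R, k))

gaps = []
for k in (0.1, 0.5, 1.0, 2.0, 3.0):      # arbitrary test values of k
    left = f(k) * f(-k)
    right = det(minus_diag(R2, k * k))
    gaps.append(abs(left - right))

print(max(gaps))
```

Since the two sides agree for every $k$ tried, a root $k$ of $f$ corresponds to a root $k^2$ of the characteristic equation of $R^2$, which is the sense in which squaring the matrix separates the roots.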

In the use of the original iterative method by several workers it 
was observed that it was often impossible to determine the last digit 
accurately without carrying the iteration considerably further than 
at first seemed necessary, and of course using more decimal places 
than were finally to be retained. This difficulty largely disappears 
with the use of the method of the present note, since it is so easy to 
make the equivalent of 8, 16 and 32 iterations in a single operation. 
However it suggests the theoretical problem of finding limits of error 
in the determination of the coefficients and the k’s, in terms of the 
differences between consecutive trial values. This problem is very 
intriguing; but a solution valid with certainty under all circumstances 
appears upon consideration to be impossible. Indeed, as was pointed 
out in the earlier paper, if the trial values first taken happen to be 
the coefficients of the tests in a linear function of those whose
correlation with $y_1$ is exactly zero, we shall never get $y_1$, no
matter how many times we iterate. If the correlation with $y_1$ is
almost but not quite zero we shall usually seem to have convergence
for a time to another set of values, the coefficients of $y_2$, but
eventually the discrepancies between consecutive trial values will
increase, and in the end the coefficients of $y_1$ will be approached.
But although an exact limit of error is thus seen to be impossible if
we insist on certainty, we shall attain to a very high probability of
having the right limit if we carry the iteration far enough to reach
stability in three or four decimals; and this is easy when, as in the
example below, an additional decimal place is obtained accurately at
each trial.

An additional safeguard against spurious convergence to the 
wrong principal component possibly useful in some cases would be 
to use two or more different sets of trial values. If all converged to 
the same result, it would be incredible that this was anything other 
than the greatest component. But of course the calculation of the 
later components, if carried out, would in any case reveal such an 
error. 

The symmetry of the matrices makes it unnecessary to write the 
elements below and to the left of the principal diagonal. The ith row 
is to be read by beginning with the ith element of the first row, read- 
ing down to the diagonal, and then across to the right. 

Each set of trial values is divided by an arbitrary one of them, 
which may well be taken to be the greatest. This division may well 
be performed with a slide rule for the first few sets, which do not 
require great accuracy. 


EXAMPLE 


The correlations in the matrix $R$ below were obtained by Truman L.
Kelley from 140 seventh-grade school children, and have been corrected
for attenuation (4). The variates, in order, are: memory for words;
memory for numbers; memory for meaningful symbols; memory for
meaningless symbols. At the right of each matrix (which are supposed
to have the vacant spaces filled out so as to be symmetrical) is a
check column consisting of the sums of the entries made and understood
in the several rows.


MATRIX OF CORRELATIONS

                                                     Check
                                                     column
           1.       .9596    .7686    .5427     |    3.2709
$R$ =               1.       .8647    .7005     |    3.5248
                             1.       .8230     |    3.4563
                                      1.        |    3.0662

SQUARE OF MATRIX OF CORRELATIONS

           2.8061   2.9640   2.8136   2.3902    |   10.9738
$R^2$ =             3.1592   3.0435   2.6334    |   11.8001
                             3.0158   2.6688    |   11.5417
                                      2.4626    |   10.1550

           30.289   32.538   31.780   27.908    |   122.515
$R^4$ =             34.964   34.161   30.012    |   131.678
                             33.397   29.361    |   128.699
                                      25.835    |   113.115

           3765     4046     3955     3476      |   15242
$R^8$ =             4349     4251     3736      |   16382
                             4154     3651      |   16011
                                      3209      |   14072








Multiplying the rows of $R^8$ by an initial set of trial values all
equal to 1 we obtain the sums of the rows of this matrix, namely,

15 242, 16 382, 16 011, 14 072.

The last three digits of these numbers are unimportant. We can divide
each of these values by the second, since this is the greatest,
retaining only a single decimal place, and multiply the values thus
obtained by the rows of $R^8$: in dividing this time we retain two
decimal places. With the next iteration we retain two decimal places;
with the next, three; with the next, four; and with all later
iterations, five. The seventh and eighth sets of trial values thus
obtained are exactly identical in all five decimal places; they are

.93042, 1.00000, .97739, .85903 . (1)

The products of this set by the rows of $R^8$ are

14 402, 15 479, 15 129, 13 297. (2)

Their products by the rows of $R$ itself, divided by the second of
them, are

.93045, 1.00000, .97738, .85901 . (3)

These remain exactly stationary under further iteration with $R$;
their products by the rows of $R$ are

3.10744, 3.33972, 3.26418, 2.86884 .

From the second of these values, which corresponds to the value unity
in the preceding set, we have $k_1 = 3.33972$. From the second of (2),
$k_1^8 = 15\,479$. Hence, by division, $k_1^7 = 4635.1$.
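The iteration described above can be sketched in a few lines. What follows is an illustration under assumptions, not the hand procedure itself: full floating-point precision replaces the gradually increasing number of decimals, the stopping rule (agreement of consecutive trial values to five places) is enforced by the hypothetical helper `iterate`, and `mat_mul`, `mat_vec` and `divide_by_greatest` are names chosen here. It counts the trials needed with $R$ and with $R^8$, then recovers $k_1$ and $k_1^7$ as in the text.

```python
R = [
    [1.0, 0.9596, 0.7686, 0.5427],
    [0.9596, 1.0, 0.8647, 0.7005],
    [0.7686, 0.8647, 1.0, 0.8230],
    [0.5427, 0.7005, 0.8230, 1.0],
]

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][h] * b[h][j] for h in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(m, v):
    """Products of the trial values v by the rows of m."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

def divide_by_greatest(v):
    """Divide every trial value by the one of greatest magnitude."""
    big = max(v, key=abs)
    return [x / big for x in v]

def iterate(m, places=5, max_steps=100):
    """Iterate from trial values all equal to 1 until two consecutive
    sets agree to `places` decimals; return (values, number of trials)."""
    half_unit = 0.5 * 10 ** (-places)
    trial = [1.0] * len(m)
    for step in range(1, max_steps + 1):
        new = divide_by_greatest(mat_vec(m, trial))
        if max(abs(x - y) for x, y in zip(new, trial)) < half_unit:
            return new, step
        trial = new
    raise RuntimeError("no agreement within max_steps")

# Three squarings give the eighth power.
R8 = R
for _ in range(3):
    R8 = mat_mul(R8, R8)

values_r, steps_r = iterate(R)
values_r8, steps_r8 = iterate(R8)

unit_index = values_r8.index(max(values_r8))   # position of the value unity
k1 = mat_vec(R, values_r8)[unit_index]         # ratio of trial values with R
k1_8 = mat_vec(R8, values_r8)[unit_index]      # ratio of trial values with R^8
k1_7 = k1_8 / k1                               # found by division

print(steps_r, steps_r8)
print([round(x, 5) for x in values_r8], round(k1, 5), round(k1_7, 1))
```

Because the run uses unrounded powers of $R$, the last digits of $k_1^7$ need not match the hand-computed figure exactly; the point of the sketch is the smaller number of trials required with $R^8$.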

Multiplying each of the quantities (3) by the square root of the ratio
of $k_1$ to the sum of the squares of these quantities, we obtain the
correlations of the first principal component $y_1$ with the several
tests; these are

$a_{11} = .9013$, $a_{21} = .9687$, $a_{31} = .9468$, $a_{41} = .8321$ . (4)


These are also the coefficients of $y_1$ in the expressions for the
tests in terms of the four principal components $y_1$, $y_2$, $y_3$,
and $y_4$.

The products of the four quantities (4) by themselves and each other
are the elements of the matrix

           .8124    .8732    .8534    .7500     |    3.2890
$C_1$ =             .9384    .9172    .8061     |    3.5349
                             .8964    .7879     |    3.4549
                                      .6925     |    3.0365

The column at the right consists of the products of (4) by their sum,
3.6490, and since it gives also the sums of the rows of $C_1$ provides
a check. We next calculate, as a basis for the determination of $y_2$,

                   .1876    .0864   -.0848   -.2073     |   -.0181
$R_1 = R - C_1$ =           .0616   -.0525   -.1056     |   -.0101
                                     .1036    .0351     |    .0014
                                              .3075     |    .0297


Upon multiplying each element of $C_1$ by the value previously found
for $k_1^7$, and then subtracting from the corresponding element of
$R^8$, we obtain $R_1^8$ without the necessity of further matrix
squaring. In this particular example, no element of $R^8$, so far as
this matrix was calculated, differs by any significant amount from the
corresponding element of $k_1^7 C_1$. Hence we cannot use $R_1^8$ to
determine $y_2$. This condition, however, points to rapid convergence
with the matrix $R_1$. Indeed, starting with the trial values

-2, -1, 0, 3,

which are approximately proportional to the elements of the check
column of $R_1$, we find after only six iterations that

$k_2 = .5252$, $a_{12} = -.4187$, $a_{22} = -.2159$, $a_{32} = .1551$, $a_{42} = .5284$ ,

correct to four decimal places. This labor could have been slightly
diminished by first calculating the matrix $R_1^2 = R^2 - k_1 C_1$.

The third principal component, found from $R_2$ and $R_2^2$ with a
total of seven iterations, is specified by $k_3 = .1168$,
$a_{13} = -.0735$, $a_{23} = -.0637$, $a_{33} = .2818$,
$a_{43} = -.1670$ .
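The successive reductions of this example can be retraced as follows. This is a sketch under assumptions: the helper names `principal_component` and `deflate` are introduced here, a fixed number of plain iterations at full floating-point precision stands in for the hand iteration, and each root is taken from a ratio of sums of products rather than from a single ratio of consecutive trial values.

```python
import math

R = [
    [1.0, 0.9596, 0.7686, 0.5427],
    [0.9596, 1.0, 0.8647, 0.7005],
    [0.7686, 0.8647, 1.0, 0.8230],
    [0.5427, 0.7005, 0.8230, 1.0],
]

def mat_vec(m, v):
    """Products of the trial values v by the rows of m."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

def principal_component(m, steps=200):
    """Return (k, loadings) for the greatest root of m by plain iteration.
    The loadings are scaled so that the sum of their squares equals k."""
    trial = [1.0] * len(m)
    for _ in range(steps):
        product = mat_vec(m, trial)
        big = max(product, key=abs)
        trial = [x / big for x in product]
    product = mat_vec(m, trial)
    sum_sq = sum(x * x for x in trial)
    k = sum(p * x for p, x in zip(product, trial)) / sum_sq
    scale = math.sqrt(k / sum_sq)
    return k, [x * scale for x in trial]

def deflate(m, loadings):
    """Subtract the products of the loadings from the elements of m."""
    n = len(m)
    return [[m[i][j] - loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]

k1, a1 = principal_component(R)
R1 = deflate(R, a1)            # R_1 = R - C_1
k2, a2 = principal_component(R1)
R2 = deflate(R1, a2)           # R_2 = R_1 - C_2
k3, a3 = principal_component(R2)

for k, a in ((k1, a1), (k2, a2), (k3, a3)):
    print(round(k, 4), [round(x, 4) for x in a])
```

Dividing each root by 4, the sum of the variances of the four tests, gives the share of the total variance attributable to the corresponding component.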

In summary, it appears that the first principal component, which
accounts for 83.5 per cent of the sum of the variances of the tests,
and has high positive correlations with all of them, represents
general ability to remember; the second, accounting for 13 per cent of
the total variance, is correlated with memory both for words and for
numbers in a sense opposite to that of its correlations with symbols
of both kinds; and the third principal component, with 3 per cent of
the total variance ascribable to it, is most highly correlated with
memory for meaningful symbols.

The foregoing calculations are carried to the maximum numbers 
of decimal places possible with the four-place correlations given. Not 
all these places are significant in the sense of random sampling. If 
only the small number (one or two) of places significant in the prob- 
ability sense, relative to the sampling errors of these 140 cases, had 
been retained at each stage, the number of iterations would have been 
reduced even further. 

Columbia University,
New York.

THE RELATIONSHIP BETWEEN DEGREE OF ORIGINAL
LEARNING AND DEGREE OF TRANSFER*


HAROLD GULLIKSEN 


It has been repeatedly demonstrated that, in a learning situation 
involving a positive and a negative stimulus, the organism is frequent- 
ly reacting to the relationship between stimuli rather than to the 
absolute characteristics of the positive or negative stimulus.† For
example, if in the original training series an animal has learned to react
positively to a ten and negatively to a five centimeter circle, he will, 
in the majority of trials, when confronted with a ten and a twenty cen- 
timeter circle, react positively to the twenty centimeter circle, that is, 
the larger of the two, and negatively to the ten centimeter circle which 
was formerly the positive stimulus. This is termed transposition of 
the relationship greater than from the training or original learning 
situation to the test or transposition situation. 

The present experiment was undertaken to determine the rela- 
tionship between the accuracy of the original learning and the ac- 
curacy of transposition. If the learning of the animal is primarily on 
a relational basis, then an increase in accuracy of the original learning 
would probably be accompanied by an increase in the accuracy of the 
transposition. If the absolute size of the positive stimulus is partly 
effective, however, then it might be expected that continued practice 
in responding to that particular size might be accompanied by a de- 
crease in the accuracy of transposition on the basis of relative size. 

The jumping technique for visual discrimination devised by 
Lashley (4) was used to train the rats. The stimuli used were two 
solid white circles, 18 and 12 cm. in diameter, on a black background. 
Two different criteria of learning were used, in order to test the 
effect of differences in the accuracy of original response. One group 
of rats was trained to a criterion of 10 consecutive errorless trials, 
and then given the transposition test. The other group was trained to 
a criterion of 30 consecutive errorless trials, and then tested for trans- 
position. The animals trained to 30 consecutive errorless trials have 
learned more accurately than those trained to 10 in the sense that 
the former group has made 10 errorless performances (just like the 

*My special thanks and appreciation are extended to Dr. Martin L.
Reymert, the Director of the Mooseheart Laboratory for Child Research,
for his encouragement in this work, and for furnishing the facilities
for the experimental work.

†See bibliographical references in Klüver (3).

latter group) and in addition has made at least 20 more errorless
performances. The stimuli used for the transposition test were two solid
white circles, 12 and 8 cm. in diameter, on a black background. Dur- 
ing the training period with circles 18 and 12 cm. in diameter one of 
the stimulus cards was positive, that is, when the animal jumped to 
that card, it fell down and he landed on the platform behind, receiving 
food. The other stimulus was a negative stimulus, that is, the animal 
jumping to it could not get through to the platform behind, fell into a 
net, and was made to jump again. During the transposition tests both 
stimuli were positive, in that the animal went through to the platform 
and received food regardless of which stimulus card was chosen. In 
other words, during the transposition tests, no punishment was given. 

Eleven animals completed this series of experiments; five of these 
were trained positively to the 12 cm. circle, and negatively to the 18 
cm. circle. Of these, two were trained to a criterion of 10 consecutive 
errorless trials,* and three were trained to a criterion of 30 consecu- 
tive errorless trials. Six of the animals were trained positively to the 
18 cm. circle, and negatively to the 12 cm. circle. Three of these were
trained to a criterion of ten, and three to a criterion of 30 consecutive 
errorless trials. The animals were given 10 trials a day except on the 
day that the criterion was satisfied, when they were immediately given 
the ten transposition tests, making 20 trials on the last day of work. 

The stimuli used in the transposition test were the 8 and 12 cm. 
circles mentioned above. If an animal trained positively to the 18 and 
negatively to the 12 cm. circles responded positively to the 12 as op- 
posed to the 8 cm. circle, it had reacted to the relationship larger than 
in both sets of experiments, that is, when presented with the test 
stimuli, it had transposed this relationship to a different place on the 
absolute size scale. Similarly, an animal trained positively to the 12 
and negatively to the 18 cm. circle transposed correctly if it responded 
to the 8 as opposed to the 12 cm. circle in the transposition test, that 
is, this animal reacted to the relationship smaller than in both sets of 
experiments. 

Each failure to transpose was called an error. The number of 
errors made by the animals in the 10 criterion group was compared 
with the number of errors made by those trained to a criterion of 30 
consecutive errorless trials. The average number of errors in the 


*A trial is defined in the following manner: The cards containing the
stimuli were kept in the same position and the rat was put back on the
jumping platform as often as he chose the negative stimulus and fell
into the net. This was repeated with the cards in the same position
until the animal chose the positive stimulus. This series of jumps,
with the cards in the same position, terminated by one correct jump is
called one trial. Therefore it can be seen that the number of trials
is the same as the number of correct responses.

transposition tests was 2.6 for the former, and 2.2 for the latter group.
The overlapping was great as can be seen from Table I which gives 
the complete data including the total number of errors and trials made 
by each animal up to the beginning of the transposition tests. (The 
column labeled du/dw will be explained later.) 


TABLE I

Summary of Experimental Data

          Number of                                               du/dw
          consecutive     Number of     Total        Total        (slope of
Rat       errorless       errors in     errors       trials       learning
Number    trials used     10 transfer   in the       in the       curve where
          as criterion    tests         learning     learning     training
          of learning                   series       series       ceased)

   3          10              2            78          100          .08
   5          10              4            41          110          .15
   6          10              4            81           70          .16
  10          10              2            56           80          .06
  11          10              1            42           90          .09
  12          30              4           107          130          .11
  13          30              6            67          120          .09
  15          30              0            45          180          .03
   1          30              0            55          240          .01
   2          30              2           148          170          .03
   4          30              1            66          160          .02








A comparison of the average number of errors in transposition 
made by animals trained to a criterion of 10 consecutive errorless 
trials with the average number of errors in transposition made by 
those trained to 30 consecutive errorless trials was made. The com- 
parison of these two groups by the use of the critical ratio, which is 
0.17, would lead one to conclude that there is no clear difference be- 
tween the two groups. However, another and more precise method of 
analyzing the data may be used. 

The number of consecutive errorless trials is an inaccurate criterion
of learning, since such a criterion considers only the last 10 or the
last 30 trials that the animal has made, and ignores all the rest of
the learning record of that animal. It is possible to estimate the
number of errors per trial that the animal is making by computing the
slope (du/dw) of the learning curve at the point where the training
ceased (2). The particular form of the learning curve used in this
case is the plot of cumulative errors against cumulative correct
responses, each correct response being counted one trial. The slope of
the learning curve at the point where training ceased is not to be
confused with the total errors, the total trials, or the ratio between
them. This ratio is the average number of errors per trial over the
entire learning period. The difference can be seen by reference to
figure I which gives a typical learning curve, plotted as cumulative
errors (u) against cumulative correct responses (w). The distance
along the base line from 0 to 120 represents total trials; the
vertical distance from 0 to 67 represents total number of errors made
in learning. The average errors per trial is represented by the slope
of the line OB and is 67/120. On the other hand, the quantity which
represents the errors per trial when training ceased is du/dw, that
is, the slope of line CD which is tangent to the learning curve at the
point where training ceased. The slope of this tangent at the point
where training ceased represents the relative accuracy of the animal's
response. The steeper the line CD the greater the number of errors per
trial that the animal was making when training ceased. The flatter the
line CD the more accurately the task has been learned.

There are several methods of calculating the value of du/dw. One of
the simplest is the graphic method which makes use of the proposition
that u/w approaches du/dw, as u and w both approach zero, that is, at
the origin. In order to use this method, simply tabulate the
cumulative errors of any one animal (a sample case is shown in Table
II) and subtract each entry from the total number of errors, giving a
column of the number of errors still to be made, before the training
ceases, that is, before the end of the training period. Do the same
for the number of trials. (See columns headed T-w and E-u in Table
II.) Compute the quantities (E-u)/(T-w) = r and plot these ratios
against the E-u values. The last entries, where E-u = 0 and r = 0,
should not be included in the plot. Then draw a free hand curve
through these points projecting it to meet the E-u axis. The value of
the ratio at this point is the slope of the curve at the point where
training ceased. A check value of this ratio can be found by plotting
r against T-w in the same fashion. If computational methods are
preferred to graphic ones, methods of finite differences may be used
(5).
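The tabulation described above is easily mechanized. The sketch below uses the cumulative record of rat No. 2 as given in Table II. One step is an assumption made here and is not the author's procedure: in place of the free hand curve, a least-squares straight line through the last four plotted points is projected to E-u = 0. The function name `intercept_of_fitted_line` is likewise illustrative.

```python
# Cumulative trials (w) and cumulative errors (u) for rat No. 2 (Table II).
w = list(range(0, 180, 10))
u = [0, 27, 35, 70, 81, 94, 103, 115, 126, 131,
     136, 141, 144, 146, 148, 148, 148, 148]

T = w[-1]   # total trials = 170
E = u[-1]   # total errors = 148

# Points (E - u, r) with r = (E - u) / (T - w); entries with E - u = 0
# are left out of the plot, as the text directs.
points = []
for wi, ui in zip(w, u):
    errors_left = E - ui
    trials_left = T - wi
    if errors_left > 0:
        points.append((errors_left, errors_left / trials_left))

def intercept_of_fitted_line(xy):
    """Least-squares line through the points, evaluated at x = 0.
    This is a stand-in for the free hand projection (an assumption)."""
    n = len(xy)
    mean_x = sum(x for x, _ in xy) / n
    mean_y = sum(y for _, y in xy) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in xy)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in xy)
    slope = sxy / sxx
    return mean_y - slope * mean_x

ratios = [r for _, r in points]
du_dw = intercept_of_fitted_line(points[-4:])   # last four points only
print([round(r, 2) for r in ratios])
print(round(du_dw, 2))
```

How many of the final points to use is a matter of judgment, just as the drawing of the free hand curve is; a different choice would shift the projected value somewhat.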

The values labeled du/dw in Table I were found and checked by 
the graphic method, and represent the errors per trial that the animal 
was making when training ceased. This gives a criterion of learning 
which is dependent upon the entire learning record of the animal in- 
stead of on only a portion of it. The plot of this criterion against 
errors made in transposition is shown in figure 2. 

TABLE II

Calculations for Determining Proposed Criterion
of Learning (du/dw) for Rat No. 2

      w               u
 (Cumulative     (Cumulative       E-u       T-w      (E-u)/(T-w)
   trials)         errors)                               or r

      0               0            148       170         .87
     10              27            121       160         .76
     20              35            113       150         .75
     30              70             78       140         .55
     40              81             67       130         .52
     50              94             54       120         .45
     60             103             45       110         .41
     70             115             33       100         .33
     80             126             22        90         .24
     90             131             17        80         .21
    100             136             12        70         .17
    110             141              7        60         .12
    120             144              4        50         .08
    130             146              2        40         .05
    140             148              0        30
    150             148              0        20
    160             148              0        10
    170             148              0         0

T = total trials = 170

E = total errors = 148

The straight line shown is a least square fit. The slope of the line
is 27.2 and the standard error of the slope is 8.6, giving a t value
(1) of 3.16. Fisher's tables show that the probability of obtaining
such a result, if the true slope is zero in the infinite population
from which this sample is drawn, is between .02 and .01. This
indicates that the slope is definitely different from zero which means
that the more perfectly the animal had learned the original problem
the more perfectly did he respond to the same relationship in the test
stimuli.
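The fit reported above can be recomputed from the last and third columns of Table I. The sketch below is illustrative: the variable names are chosen here, and the standard error is taken from the usual formula for a fitted slope with n - 2 degrees of freedom, which is assumed to be the one the author used.

```python
import math

# du/dw and errors in the 10 transfer tests, in the row order of Table I
# (rats 3, 5, 6, 10, 11, 12, 13, 15, 1, 2, 4).
du_dw = [0.08, 0.15, 0.16, 0.06, 0.09, 0.11, 0.09, 0.03, 0.01, 0.03, 0.02]
transfer_errors = [2, 4, 4, 2, 1, 4, 6, 0, 0, 2, 1]

n = len(du_dw)
mean_x = sum(du_dw) / n
mean_y = sum(transfer_errors) / n

sxx = sum((x - mean_x) ** 2 for x in du_dw)
syy = sum((y - mean_y) ** 2 for y in transfer_errors)
sxy = sum((x - mean_x) * (y - mean_y)
          for x, y in zip(du_dw, transfer_errors))

slope = sxy / sxx                                  # least-squares slope
residual_ss = syy - slope * sxy                    # sum of squared residuals
se_slope = math.sqrt(residual_ss / (n - 2) / sxx)  # standard error of slope
t_value = slope / se_slope

print(round(slope, 1), round(se_slope, 1), round(t_value, 2))
```

With 9 degrees of freedom the resulting t is to be referred to Fisher's table, as in the text.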

It is interesting to note that a more precise method of analysis 
of the data which takes into account the entire learning curve of the 
subject brought out a relationship which was obscured by the more 
usual method dealing with data of this sort. 


SUMMARY 


An experiment was performed to determine the relationship between the
accuracy of the original learning and the accuracy of transposition.
The usual method of comparison of the average number of errors in the
transposition test made by a group of rats trained to a criterion of
10 consecutive errorless trials with the average number of errors made
by a group of rats trained to a criterion of 30 consecutive errorless
trials reveals no clear difference between the groups.

However, the degree of learning at the point when training 
ceased, plotted against errors made in transposition, brings out a clear 
relationship between degree of original learning and accuracy of 
transposition. Within the range of learning tested, the more accurate 
the original learning was, the more accurate was the transposition. 





FIGURE I. Learning curve for rat No. 13: cumulative errors (u)
plotted against cumulative correct responses (w).
The curved line is the calculated curve.
The circles are experimentally determined points.
The slope of line OB equals average number of errors per trial.
The slope of line CD equals du/dw or the number of errors per trial
when training ceased.

FIGURE II. Relationship between proposed criterion of learning (errors
per trial when training ceased) and number of errors in transfer tests.

THE USE OF THE DOOLITTLE METHOD IN OBTAINING
RELATED MULTIPLE CORRELATION COEFFICIENTS


ALBERT K. KURTZ 


It is well known that at the present time the Doolittle method is 
one of the most efficient methods of computing the multiple correla- 
tion coefficient between a criterion and several independent variables. 

This presentation calls attention to an extremely simple modification*
of the Doolittle method by means of which (a) a single forward
solution will supply all the data necessary for
$\frac{(n-1)(n-2)}{2}$ multiple correlation coefficients instead of
the usual one, or (b) the multiple correlation between each of several
criteria and the same set of independent variables may be obtained
with only a little more work than is needed to obtain the multiple
correlation between these independent variables and a single
criterion. It is even possible to compute the multiple correlation
between several independent variables and a criterion; and then to
regard one of the former independent variables as a second criterion
and compute the correlation between the remaining independent
variables and the new criterion. In all these cases the number of back
solutions is equal to the number of multiple correlation coefficients
desired, but a single forward solution suffices.

The Doolittle method will not be described in detail, since this has
been done elsewhere.† However, in order to illustrate the modification
here discussed, we shall carry through a Doolittle solution of a six
variable problem. For reasons which will be apparent later, the
variables have been arranged in the time order in which their values
ordinarily become known. Three of the variables, $X_3$, $X_4$, and
$X_5$, become known at practically the same time and they have been
arranged according to the size of their zero order correlations with
$X_7$.

*The modification discussed here was suggested by the writer and first
used in 1928 by R. J. Wherry. It is possible that it may have been
discovered and used elsewhere before that time.

†See, for example:

Ezekiel, Mordecai. Methods of correlation analysis. New York: John
Wiley & Sons, Inc., 1930. pp. 362-367.

Mills, Frederick Cecil. Statistical methods applied to economics and
business. New York: Henry Holt and Company, 1924. pp. 577-581.

Peters, Charles C. & Wykes, Elizabeth Crossley. Simplified methods for
computing regression coefficients and partial and multiple
correlations. Journal of Educational Research. 1931, 23, 383-393.

Smith, Bradford B. The use of punched card tabulating equipment in
multiple correlation problems. Washington: U. S. Department of
Agriculture, Bureau of Agricultural Economics, October, 1923
(mimeographed). pp. 24.

The data used in the example pertain to 487 men who entered the 
College of Liberal Arts of The Ohio State University in the Autumn 
Quarter of 1923. The basic data and the complete solution of the 
multiple correlation are given below. 


VARIABLES

$X_1$ = high school record
$X_2$ = intelligence percentile
$X_3$ = number of hours of A (high) grades received in first quarter
$X_4$ = number of hours of E (failing) grades received in first quarter
$X_5$ = whether or not student was put on probation at end of first
        quarter (Yes = 1; No = 0)
$X_6$ = number of quarters' persistence in college (modified to give
        credit for graduation)
$X_7$ = whether or not student was ultimately graduated (Yes = 1; No = 0)


TABLE OF INTERCORRELATIONS

                     X1        X2        X3        X4        X5        X6        X7

X1  High School    +1.0000   +.3203    +.3690    -.3173    -.2940    +.3664    +.2604
X2  Intelligence   +.3203    +1.0000   +.3252    -.3350    -.3209    +.2379    +.1339
X3  Hours of A     +.3690    +.3252    +1.0000   -.2376    -.1911    +.2766    +.2872
X4  Hours of E     -.3173    -.3350    -.2376    +1.0000   +.7842    -.5080    -.2544
X5  Probation      -.2940    -.3209    -.1911    +.7842    +1.0000   -.4204    -.1991
X6  Persistence    +.3664    +.2379    +.2766    -.5080    -.4204    +1.0000   +.6787
X7  Graduation     +.2604    +.1339    +.2872    -.2544    -.1991    +.6787    +1.0000








The reader will note in the following example that the signs of the
correlations with the criterion were not reversed in the forward
solution. The same effect was produced by reversing the signs of all
the constant terms used in the back solution. The latter procedure has
the advantage that exactly the same procedure can be followed,
regardless of the number of criteria and regardless of the fact that a
variable (as $X_6$) may be an independent variable at one time and a
criterion variable at another.

FORWARD SOLUTION

   X1        X2        X3        X4        X5        X6        X7         S

 +1.0000   +.3203    +.3690    -.3173    -.2940    +.3664    +.2604    +1.7048
 -1.0000   -.3203    -.3690    +.3173    +.2940    -.3664    -.2604    -1.7048

           +1.0000   +.3252    -.3350    -.3209    +.2379    +.1339    +1.3614
           -.1026    -.1182    +.1016    +.0942    -.1174    -.0834    -.5460
           +.8974    +.2070    -.2334    -.2267    +.1205    +.0505    +.8154
           -1.0000   -.2307    +.2601    +.2526    -.1343    -.0563    -.9086

                     +1.0000   -.2376    -.1911    +.2766    +.2872    +1.8293
                     -.1362    +.1171    +.1085    -.1352    -.0961    -.6291
                     -.0478    +.0538    +.0523    -.0278    -.0117    -.1881
                     +.8160    -.0667    -.0303    +.1136    +.1794    +1.0121
                     -1.0000   +.0817    +.0371    -.1392    -.2199    -1.2403

                               +1.0000   +.7842    -.5080    -.2544    +.1316
                               -.1007    -.0933    +.1163    +.0826    +.5409
                               -.0607    -.0590    +.0313    +.0131    +.2121
                               -.0054    -.0025    +.0093    +.0147    +.0827
                               +.8332    +.6294    -.3511    -.1440    +.9673
                               -1.0000   -.7554    +.4214    +.1728    -1.1609

                                         +1.0000   -.4204    -.1991    +.3587
                                         -.0864    +.1077    +.0766    +.5012
                                         -.0573    +.0304    +.0128    +.2060
                                         -.0011    +.0042    +.0067    +.0375
                                         -.4754    +.2652    +.1088    -.7307
                                         +.3798    -.0129    +.0058    +.3727
                                         -1.0000   +.0340    -.0153    -.9813

                                                   +1.0000   +.6787    +1.6312
                                                   -.1342    -.0954    -.6246
                                                   -.0162    -.0068    -.1095
                                                   -.0158    -.0250    -.1409
                                                   -.1480    -.0607    +.4076
                                                   -.0004    +.0002    +.0127
                                                   +.6854    +.4910    +1.1765
                                                   -1.0000   -.7164    -1.7165

The β weights arrived at as a result of the back solution are checked
in every one of the original normal equations. If the obtained β
weights check perfectly (or within, say, .0005 when the work is
carried to four decimal places) in all of the normal equations, we can
be certain that the β weights are correct. Failure to obtain a perfect
check in any one of the equations is positive proof that an error has
been made.*
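The check in the normal equations amounts to a handful of sums of products. The sketch below is an illustration with names chosen here; it substitutes the five β weights reported later in this example for predicting $X_7$ from $X_1$ to $X_5$ into the five normal equations and compares each discrepancy with the .0005 allowance mentioned above.

```python
# Intercorrelations of X1..X5 (rows and columns in that order).
CORR = [
    [1.0000, 0.3203, 0.3690, -0.3173, -0.2940],
    [0.3203, 1.0000, 0.3252, -0.3350, -0.3209],
    [0.3690, 0.3252, 1.0000, -0.2376, -0.1911],
    [-0.3173, -0.3350, -0.2376, 1.0000, 0.7842],
    [-0.2940, -0.3209, -0.1911, 0.7842, 1.0000],
]
# Correlations of X1..X5 with the criterion X7.
R_CRITERION = [0.2604, 0.1339, 0.2872, -0.2544, -0.1991]

# Beta weights for X1..X5 as reported for the five-variable solution.
BETA = [0.1419, -0.0352, 0.2054, -0.1844, 0.0153]

TOLERANCE = 0.0005   # allowance for work carried to four decimal places

# Each normal equation: sum over j of r_ij * beta_j should equal r_i7.
discrepancies = [
    sum(CORR[i][j] * BETA[j] for j in range(5)) - R_CRITERION[i]
    for i in range(5)
]
all_check = all(abs(d) <= TOLERANCE for d in discrepancies)

print([round(d, 5) for d in discrepancies], all_check)
```

An equation whose discrepancy exceeded the allowance would point to the variable whose row and column in the forward solution deserve a second look.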

After the β weights have been checked, $R_{0.123\cdots n}$ may be
computed by the formula

$$R_{0.123\cdots n} = +\sqrt{\sum \beta_{0i.123\cdots(\ )\cdots n}\, r_{0i}}$$

instead of by the more cumbersome formula which must be used for any
weights other than exact β weights.

In the illustration, $R_{7.123456}$ was found to be +.6978. But since
$X_6$ is not known until a student has finished his college career, it
is of more practical interest to know how well $X_7$ can be predicted
from the five variables which are known at the end of the student's
first quarter in the University. This multiple correlation,
$R_{7.12345}$, can be obtained from the same forward solution that was
used in obtaining $R_{7.123456}$.

Let us examine the forward solution used in obtaining $R_{7.123456}$
and see how it would be affected if the variable $X_6$ had not been
included. The column headed $X_6$ would be missing as would also the
eight rows in the last section of the computation. All the figures in
the S column would be changed, but since the S column has already
served its purpose as a check and is not used in the back solution,
the changes in the S column may be ignored. No other changes would
have been produced by the omission of $X_6$. Consequently, the
appropriate β regression weights and $R_{7.12345}$ can be computed by
simply ignoring the figures in the column headed $X_6$ and in the last
eight rows. This gives:

$$\beta_{75.1234} = +.0153$$
$$\beta_{74.1235} = -.7554\,\beta_{75.1234} - .1728 = -.1844$$
$$\beta_{73.1245} = +.0817\,\beta_{74.1235} + .0371\,\beta_{75.1234} + .2199 = +.2054$$
$$\beta_{72.1345} = \cdots = -.0352$$
$$\beta_{71.2345} = \cdots = +.1419$$

and $R_{7.12345} = +\sqrt{+.1351} = +.3675$ .


*After carrying through the Doolittle method several times, an accurate 
computer would do well to consider eliminating the check column which is usually 
included in the forward solution. In the example given here, one-fifth of the
recorded figures are in the check column and more than one-fifth of the
computation time would be saved by the omission of this column. Checking the
obtained β weights in the original normal equations checks both the forward and
the back solutions. If one of the equations fails to check, several
possibilities are open: (1) the error may be further localized by computing β
weights for predicting the criterion from only part of the independent
variables (as explained later);
(2) the check column may be computed and used to locate the error; (3) the 
error may be looked for in the vertical and horizontal sections of the forward 
solution dealing with the variable common to all the terms in the normal equa- 
tion that fails to check; or (4) the error may be looked for in the back solution. 
Sign errors are common, but they can be practically eliminated from the forward 
solution by always writing down the entire row of signs before each series of 
multiplications. 
By similarly eliminating from the forward solution the column
and the seven rows dealing with $X_5$ in addition to those dealing with
$X_6$, we can obtain $R_{7.1234}$. If, in addition, the data pertaining to $X_4$
are eliminated, we can obtain $R_{7.123}$. Finally, the additional
elimination of the computations relating to $X_3$ will give $R_{7.12}$. These
correlations* are:


$R_{7.123456} = +.6978$   Prediction made when persistence is known.

$R_{7.12345} = +.3675$   Prediction made when students are placed on probation.

$R_{7.1234} = +.3674$   Prediction made when first quarter A and E grades are used.

$R_{7.123} = +.3319$   Prediction made when first quarter A grades are used.

$R_{7.12} = +.2658$   Prediction made when intelligence test is scored.

$r_{71} = +.2604$   Prediction made at time of high school graduation.


Another examination of the forward solution will reveal that our
criterion differs from the other variables only in the fact that it is
last (except for the check column). If the entire column devoted to
it were omitted, $X_6$ could then be regarded as a criterion and $R_{6.12345}$
could be computed just as $R_{7.12345}$ was computed, variable $X_6$ replacing
$X_7$ as the criterion variable. The β weights are:


$$
\begin{aligned}
\beta_{65.1234} &= -.0340 \\
\beta_{64.1235} &= -.7554\,\beta_{65.1234} - .4214 = -.3957 \\
&\ \ \text{etc.}
\end{aligned}
$$

and

$$R_{6.12345} = +.5609 .$$


By analogous methods, other multiple correlation coefficients such 
as Re.2s3 = +.4077 and R;.12 = +.3791 may be obtained. In fact, 
when the variables are numbered and arranged as they were in the 
preceding example, the same forward solution can be used in obtain- 
ing the multiple correlation between any variable regarded as a crite- 
rion and any consecutive group of variables numbered from one up 
to (but not including) the number of the criterion variable. With 


*These correlation coefficients are carried to four decimals for purposes of 
comparison only. Since their standard errors vary between .02 and .04, only the 
first two figures are significant. 
seven variables, fifteen multiple correlations can be computed from a
single forward solution; and, in general, where n is the total number
of variables, there are

$$\frac{(n-1)(n-2)}{2}$$

such multiple correlations.


These are not all of the possible multiple correlations among the vari- 
ables, but they are apt to be the most important ones if the variables 
are arranged in order of the time at which they become known, the 
size of their zero order correlations with the criterion, the cost of ob- 
taining the scores on each of the independent variables, or according 
to some other significant characteristic. 
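The claim that one forward solution yields every multiple correlation between a criterion and a consecutive group of lower-numbered variables can be illustrated with a triangular factorization. The sketch below is an assumption-laden stand-in, not the Doolittle work sheet of the article: it uses NumPy's Cholesky factor as the "forward solution", a hypothetical equicorrelated matrix, and an illustrative function name.

```python
import numpy as np

def nested_multiple_correlations(corr):
    """Return {(criterion, k): R} for criterion c and predictors 1..k (k < c),
    all read from one Cholesky factor of the correlation matrix.
    Variables are numbered from 1, as in the text."""
    L = np.linalg.cholesky(corr)          # one triangular "forward solution"
    n = corr.shape[0]
    out = {}
    for c in range(2, n):                 # 0-based criterion index: variables 3..n
        for k in range(2, c + 1):         # at least two leading predictors
            # R^2 equals the sum of squares of the first k entries of row c of L
            out[(c + 1, k)] = float(np.sqrt(np.sum(L[c, :k] ** 2)))
    return out

# Hypothetical 7-variable correlation matrix: every intercorrelation is .3.
n = 7
corr7 = np.full((n, n), 0.3) + 0.7 * np.eye(n)
table = nested_multiple_correlations(corr7)
print(len(table))   # (n - 1)(n - 2) / 2 = 15 entries for n = 7
```

The key `(7, 5)` here plays the role of $R_{7.12345}$; omitting later columns of the factor never changes the earlier ones, which is the property the text relies on.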

If several criteria are used, and it is not desired to predict any 
one of the criteria from a knowledge of the others, the forward solu- 
tion is stopped one or more sections before what would otherwise be 
regarded as the end. For instance, if $R_{7.12345}$ and $R_{6.12345}$, but not
$R_{7.123456}$ or $R_{6.123457}$, are desired, the last eight rows of the forward
solution illustrated could be omitted. One more column, but no extra
rows are needed for each additional criterion variable included in the 
study. 


The Procter and Gamble Company, 
Cincinnati, Ohio. 
OBTAINING A COMPOSITE MEASURE FROM A NUMBER OF
DIFFERENT MEASURES OF THE SAME ATTRIBUTE


PAUL HORST 


It frequently happens that we wish to measure the magnitude of 
each of a series of entities with respect to some particular attribute. 
We may have a number of different measures on each member of the 
series. Each measure may be regarded as a valid measure of the
attribute whose magnitude we wish to determine. Our problem is to
combine the separate measures for each member of the series into a
single composite measure.

For example, the members of the series may be students. We 
may wish to measure the scholastic ability of each student. Suppose 
that we have a number of measures of scholastic ability on each stu- 
dent, such as semester grades, achievement test scores, intelligence 
test scores, etc. How shall we combine these measures into a single 
composite measure of scholastic ability for each student? 

Again, suppose we have a group of industrial employees. On 
each employee we have various measures of efficiency such as attend- 
ance, tardiness and sickness records, production records, ratings of 
various types by supervisors, and so on. How shall we combine these 
various measures into a single composite measure of efficiency?

Or to take an example from economics, the members of our series 
may be various geographical sections or they may be members of a 
time series all of which represent the same geographical unit. Sup- 
pose that the measure we wish to obtain for each unit is an index of 
industrial activity, business prosperity, or what not. For each mem- 
ber we may have a number of separate measures, such as income tax 
returns, volume of production, retail sales, etc. How shall we combine 
these into a single index of prosperity? 

Evidently, for each of these examples there are various methods 
of obtaining a single composite measure from the separate measures. 
We shall begin, however, by assuming that the purpose in obtaining 
the separate measures on each member of the group is to find to what 
extent the members differ from one another. We shall assume further 
that we have no a priori basis on which to decide which of the meas- 
ures are more valid for measuring the attribute we wish to study. 
These two assumptions lead us to the basic assumption underlying 
our method for combining the separate measures into a single com- 
posite measure, viz., the separate measures should be combined in 
such a manner that the composite measure will result in giving the 
maximum difference between all possible pairs of members in the
group.

Suppose that on a population of N cases we have n measures for
each individual represented by the variables $z_1, z_2, \ldots, z_n$. Let these
measures be in standard deviation units so that the mean value of
each variable is zero and the standard deviation unity.

Our problem is to develop a method for combining these meas- 
ures for each individual in such a way that the composite scores will 
give us the greatest possible discrimination between all possible pairs 
of individuals in the group. We might specify that the sum of the 
absolute differences between all possible pairs of scores shall be a 
maximum. Formulated in this manner, however, the problem does 
not admit of a unique solution. But if we stipulate that the sum of 
the squares of the differences between all possible pairs of scores 
shall be a maximum, a unique determination is available. This means 
simply that we want the standard deviation of the composite scores 
to be a maximum. 
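The step from "sum of squared differences between all pairs" to "standard deviation" rests on the identity $\sum_{i<j}(S_i - S_j)^2 = N^2\sigma_S^2$. A minimal numerical check, assuming NumPy and a handful of hypothetical composite scores, might look like this:

```python
import numpy as np
from itertools import combinations

def sum_sq_pair_diffs(scores):
    """Sum of squared differences over all unordered pairs of scores."""
    return sum((a - b) ** 2 for a, b in combinations(scores, 2))

# Hypothetical composite scores (illustrative values only).
scores = np.array([2.0, -1.0, 0.5, 3.0, -4.5])
N = len(scores)
lhs = sum_sq_pair_diffs(scores)
rhs = N ** 2 * scores.var()     # population variance (divisor N)
print(lhs, rhs)
```

Because N is fixed for a given group, maximizing the pairwise sum and maximizing the standard deviation are the same problem.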

First we specify that the composite measure shall be a linear
function of the variables. Thus

$$S = C\,(a_1 z_1 + a_2 z_2 + \cdots + a_n z_n) \tag{1}$$
where S is the composite measure, the a’s are the weights to be as- 
signed to each variable and C is some function of the a’s. The reason 
for C may be explained as follows: 

It is obvious that we can make the standard deviation of S as 
large as we wish simply by increasing the magnitude of the a’s. Thus 
it is not the absolute values of the a's but rather their relative
values which are significant. For this reason C should be some
decreasing function of the a's. Or if we let C = 1/K then K shall be some
increasing function of the a's which offsets any arbitrary
proportional increase in the size of the a's. We might let S be simply a
weighted average of the type

$$S = \frac{a_1 z_1 + a_2 z_2 + \cdots + a_n z_n}{\sum_{i=1}^{n} a_i}$$

which is the conventional type of weighted average. But if in our
determination of the weights some of them should come out positive
and some negative, $\sum a_i$ would decrease in size as the sum of the
negative values approached the sum of the positive values. Clearly as
$\sum a_i$ became smaller the standard deviation of S would become larger,
which means that this standard deviation would be largely a function
of the signs of the weights. We wish therefore to make $\sigma_S$
independent of the signs of the weights. This we can do readily by making
K a function of the squares of the weights. We take as a plausible
function

$$K = \sqrt{\sum_{i=1}^{n} a_i^2} \tag{2}$$

so that from (1) and (2) we have

$$S = \frac{a_1 z_1 + a_2 z_2 + \cdots + a_n z_n}{\sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}} \tag{3}$$

Our problem now is to determine the a values in such a way that $\sigma_S$
will be a maximum. If $\sigma_S$ is a maximum, obviously $\sigma_S^2$ will also be a
maximum. From the formula for the standard deviation we have


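$$\sigma_S^2 = \frac{\sum S^2}{N} - \left(\frac{\sum S}{N}\right)^2 \tag{4}$$

where the summations are over the N cases. But evaluating the second
term in the right hand side of (4) by substituting from (3) we have

$$
\left(\frac{\sum S}{N}\right)^2
= \left[\frac{\sum (a_1 z_1 + a_2 z_2 + \cdots + a_n z_n)}{N\sqrt{\sum a_i^2}}\right]^2
= \frac{\left(a_1\dfrac{\sum z_1}{N} + a_2\dfrac{\sum z_2}{N} + \cdots + a_n\dfrac{\sum z_n}{N}\right)^2}{\sum a_i^2}
= \frac{(a_1 M_1 + a_2 M_2 + \cdots + a_n M_n)^2}{\sum a_i^2} ,
$$

and since by definition all the means are zero, $\left(\sum S / N\right)^2$ vanishes and we
have simply

$$\sigma_S^2 = \frac{\sum S^2}{N} \tag{5}$$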





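If now we evaluate the right hand side of (5) by substituting
from (3) we get

$$\sigma_S^2 = \frac{\sum (a_1 z_1 + a_2 z_2 + \cdots + a_n z_n)^2}{N\,(a_1^2 + a_2^2 + \cdots + a_n^2)}$$

or expanding

$$
\sigma_S^2 = \frac{\left[\begin{aligned}
& a_1^2 \textstyle\sum z_1^2 + a_1 a_2 \sum z_1 z_2 + \cdots + a_1 a_n \sum z_1 z_n \\
&+ a_2 a_1 \textstyle\sum z_2 z_1 + a_2^2 \sum z_2^2 + \cdots + a_2 a_n \sum z_2 z_n \\
&+ \cdots \\
&+ a_n a_1 \textstyle\sum z_n z_1 + a_n a_2 \sum z_n z_2 + \cdots + a_n^2 \sum z_n^2
\end{aligned}\right]}{N\,(a_1^2 + a_2^2 + \cdots + a_n^2)} \tag{6}
$$
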
Now the summations within the brackets in (6) are of the form
$\sum z_i z_j$. If $i = j$ we have

$$\sum z_i^2 = N\sigma_i^2 = N$$

If $i \neq j$ we have

$$\sum z_i z_j = N\sigma_i\sigma_j r_{ij} = N r_{ij} \tag{7}$$

since all standard deviations are by definition unity.


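Then if we substitute the values (7) in (6) we get

$$
\sigma_S^2 = \frac{\left[\begin{aligned}
& a_1^2 + a_1 a_2 r_{12} + \cdots + a_1 a_n r_{1n} \\
&+ a_2 a_1 r_{21} + a_2^2 + \cdots + a_2 a_n r_{2n} \\
&+ \cdots \\
&+ a_n a_1 r_{n1} + a_n a_2 r_{n2} + \cdots + a_n^2
\end{aligned}\right]}{a_1^2 + a_2^2 + \cdots + a_n^2} \tag{8}
$$

Next let us define

$$
\begin{aligned}
\psi(a) &= \sigma_S^2 \\
\varphi(a) &= \text{the numerator term in (8)} \\
f(a) &= \text{the denominator term in (8)}
\end{aligned} \tag{8a}
$$

and rewrite (8)

$$\psi = \frac{\varphi}{f} \tag{9}$$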


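We wish to determine the a's so that ψ will be a maximum. To
do this the partial derivatives of ψ with respect to the a's must
vanish, i.e.

$$\frac{\partial\psi}{\partial a_1} = 0 ,\quad \frac{\partial\psi}{\partial a_2} = 0 ,\quad \ldots ,\quad \frac{\partial\psi}{\partial a_n} = 0 \tag{10}$$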
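First we take logarithms of both sides of (9).

$$\log\psi = \log\varphi - \log f \tag{11}$$

Differentiating both sides of (11) we have

$$\frac{d\psi}{\psi} = \frac{d\varphi}{\varphi} - \frac{df}{f}$$

or

$$\frac{\varphi}{\psi}\,d\psi = d\varphi - \frac{\varphi}{f}\,df \tag{12}$$

From (8a) we have

$$d\varphi = \frac{\partial\varphi}{\partial a_1}\,da_1 + \cdots + \frac{\partial\varphi}{\partial a_n}\,da_n$$

and

$$df = \frac{\partial f}{\partial a_1}\,da_1 + \cdots + \frac{\partial f}{\partial a_n}\,da_n \tag{13}$$

Substituting (13) in (12) gives

$$\frac{\varphi}{\psi}\,d\psi = \left(\frac{\partial\varphi}{\partial a_1} - \frac{\varphi}{f}\frac{\partial f}{\partial a_1}\right)da_1 + \cdots + \left(\frac{\partial\varphi}{\partial a_n} - \frac{\varphi}{f}\frac{\partial f}{\partial a_n}\right)da_n \tag{14}$$

From (10) and (14) we have

$$\frac{\partial\varphi}{\partial a_1} - \frac{\varphi}{f}\frac{\partial f}{\partial a_1} = 0 ,\quad \ldots ,\quad \frac{\partial\varphi}{\partial a_n} - \frac{\varphi}{f}\frac{\partial f}{\partial a_n} = 0 \tag{15}$$

But from (8) and (8a) we see that

$$
\begin{aligned}
\frac{\partial\varphi}{\partial a_1} &= 2\,(a_1 + a_2 r_{12} + \cdots + a_n r_{1n}) \\
&\ \ \vdots \\
\frac{\partial\varphi}{\partial a_n} &= 2\,(a_1 r_{n1} + a_2 r_{n2} + \cdots + a_n)
\end{aligned}
$$

and

$$\frac{\partial f}{\partial a_1} = 2a_1 ,\quad \ldots ,\quad \frac{\partial f}{\partial a_n} = 2a_n \tag{16}$$

From equations (15) and (16)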
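$$
\begin{aligned}
a_1\left(1 - \frac{\varphi}{f}\right) + a_2 r_{12} + \cdots + a_n r_{1n} &= 0 \\
a_1 r_{21} + a_2\left(1 - \frac{\varphi}{f}\right) + \cdots + a_n r_{2n} &= 0 \\
\cdots\qquad\qquad & \\
a_1 r_{n1} + a_2 r_{n2} + \cdots + a_n\left(1 - \frac{\varphi}{f}\right) &= 0
\end{aligned} \tag{17}
$$

The determinant of equations (17) is

$$
\Delta = \begin{vmatrix}
1 - \dfrac{\varphi}{f} & r_{12} & \cdots & r_{1n} \\
r_{21} & 1 - \dfrac{\varphi}{f} & \cdots & r_{2n} \\
\vdots & \vdots & & \vdots \\
r_{n1} & r_{n2} & \cdots & 1 - \dfrac{\varphi}{f}
\end{vmatrix} \tag{18}
$$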





In order that there be other than the trivial solution $a_1 = a_2 =
\cdots = a_n = 0$ the determinant Δ in (18) must vanish. This means that we
must determine $\varphi/f$ so that the determinant vanishes. Obviously (18)
may be expanded into a power series in $\varphi/f$ or ψ since $\psi = \varphi/f$ (see
(9)). This polynomial will be of degree n. Thus we may write

$$\Delta = C_0 + C_1\psi + C_2\psi^2 + \cdots + C_n\psi^n = 0 . \tag{19}$$

It may be shown that (19) has n real roots. Since we wish to
maximize ψ we select the largest root in (19) as our value of ψ.
Substituting this value of ψ in (17) we may by means of the first (n−1)
equations solve for the first (n−1) a's as proportions of $a_n$.

Since the ratio $\psi = \varphi/f$ is dependent only on the relative values
of the a's any values of the a's will yield the same value of ψ so long
as the ratios remain unchanged.

These weights then will guarantee that the composite scores cal- 
culated from them will give the maximum discrimination between all 
possible pairs of individuals in the population. 
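In modern terms, equations (17) state that the weights form a characteristic vector of the correlation matrix and that ψ is the corresponding root, so the exact solution can be obtained from a symmetric eigenvalue routine instead of expanding the polynomial (19). The sketch below assumes NumPy and a hypothetical 3-variable correlation matrix; `exact_weights` is an illustrative name, and this routine is a substitute for, not a reproduction of, the hand procedure in the text.

```python
import numpy as np

def exact_weights(corr):
    """Largest root psi of (19) and the weights a satisfying (17)."""
    roots, vectors = np.linalg.eigh(corr)   # real roots, ascending order
    psi = float(roots[-1])                  # largest root maximizes sigma_S^2
    a = vectors[:, -1]
    if a.sum() < 0:                         # only relative values matter; fix the sign
        a = -a
    return psi, a

# Hypothetical correlation matrix (an assumption for illustration).
corr = np.array([[1.0, 0.5, 0.3],
                 [0.5, 1.0, 0.4],
                 [0.3, 0.4, 1.0]])
psi, a = exact_weights(corr)

# Variance of the composite, equation (8): phi / f.
variance = (a @ corr @ a) / (a @ a)
print(psi, a, variance)
```

Any other set of weights gives a value of φ/f no larger than `psi`, which is the sense in which these weights give maximum discrimination.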

The method outlined gives the theoretical solution to the prob- 
lem. However, in actual practice we encounter a great deal of nu- 
merical labor if the number of variables exceeds four or five. (In one 
project carried out by the writer the number of variables was 130.) 





































In the first place the determination of the coefficients of the poly- 
nomial (19) is a laborious task. In the second place the solution for 
the roots of this equation requires an approximation procedure and 
is very lengthy with more than 6 or 7 variables. In the third place 
once the largest root of (19) has been found we have still to solve 
for the a values by some process such as the Doolittle method. This 
in itself is tedious for more than 10 or 12 variables. 

Fortunately, however, an approximate method is available which 
gives values for the a’s very close to those obtained by the exact 
mathematical method and at the same time effects a tremendous sav- 
ing of labor. The greater the number of variables the more nearly 
will the two sets of values agree. The method is as follows: 

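Consider the matrix of the intercorrelations of all the variables,
say

$$
r = \begin{pmatrix}
1 & r_{12} & \cdots & r_{1n} \\
r_{21} & 1 & \cdots & r_{2n} \\
\vdots & \vdots & & \vdots \\
r_{n1} & r_{n2} & \cdots & 1
\end{pmatrix} \tag{20}
$$

This matrix is the same as the matrix of equations (17) except
that the value $\left(\dfrac{\varphi}{f} = \psi\right)$ has not been subtracted from the diagonal
terms.

To get approximate values proportional to the a's or weights we
have merely to take the sums of columns. Thus

$$a_1 = C\sum_i r_{i1} ,\quad a_2 = C\sum_i r_{i2} ,\quad \ldots ,\quad a_n = C\sum_i r_{in} . \tag{21}$$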


These weights, however, are the ones to be used in case the variables
are given in standard deviation units. The weights to be used in case
the original raw units are used may be derived as follows. Let a
standard measure be defined in the usual manner.

$$z_i = \frac{X_i - M_i}{\sigma_i} \tag{22}$$

We have then

$$S = a_1 z_1 + a_2 z_2 + \cdots + a_n z_n \tag{23}$$

or substituting (22) in (23) we have

$$S = \frac{a_1}{\sigma_1}X_1 + \frac{a_2}{\sigma_2}X_2 + \cdots + \frac{a_n}{\sigma_n}X_n + K \tag{24}$$

where $K = -\sum_{i=1}^{n} \dfrac{a_i M_i}{\sigma_i}$.


The value K may be neglected, however, since it does not change the 
relative order of the composite measures. 
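The passage from standard scores to raw scores in (22) to (24) can be verified on made-up data. This is a minimal sketch assuming NumPy; the raw scores, their means and spreads, and the weights are all hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical raw scores: 40 cases on 3 measures with different units.
X = rng.normal(loc=[10.0, 50.0, 3.0], scale=[2.0, 8.0, 0.5], size=(40, 3))
M = X.mean(axis=0)
sigma = X.std(axis=0)                 # population standard deviations
a = np.array([0.5, 0.3, 0.8])         # hypothetical weights for standard scores

Z = (X - M) / sigma                   # equation (22)
S_standard = Z @ a                    # equation (23)
K = -np.sum(a * M / sigma)            # constant term of equation (24)
S_raw = X @ (a / sigma) + K           # equation (24)
```

Since K is the same for every case, dropping it shifts all composites equally and leaves their order and their standard deviation unchanged.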
We may now substitute the values of the a's given by (21) in
(24). Neglecting C in (21) and K in (24) we have

$$S = \frac{\sum_i r_{i1}}{\sigma_1}X_1 + \frac{\sum_i r_{i2}}{\sigma_2}X_2 + \cdots + \frac{\sum_i r_{in}}{\sigma_n}X_n \tag{25}$$





Equation (25) gives the weights to be used in deriving a composite 
score from the measures given in terms of the original units. These 
composite measures approximate the maximum dispersion obtainable 
from a linear combination of the original measures, so that the sum 
of the squares of the differences between all possible pairs of meas- 
ures is a maximum. 

To illustrate the method numerically we may take some data 
used in arriving at composite measures of efficiency for a group of 
industrial employees. On each individual in the group three separate 
measures of efficiency were available. These were: 


(1) Length of Service. 
(2) Efficiency ratings derived by comparing each man in the 
group with every other man in the group. 


(3) Efficiency ratings derived by checking for each man a
descriptive rating schedule. This schedule contains a series
of scaled statements describing various degrees and
qualities of performance on the job.


The table of intercorrelations of the three measures is as follows:

           1        2        3
   1     1.000     .333     .194
   2      .333    1.000     .288
   3      .194     .288    1.000

For the standard deviations we have

$\sigma_1 = 2.376$ ,  $\sigma_2 = 1.839$ ,  $\sigma_3 = .587$

and for the column summations of the table of intercorrelations,

$\sum r_{i1} = 1.527$ ,  $\sum r_{i2} = 1.621$ ,  $\sum r_{i3} = 1.482$ .

Substituting these values and those of the σ's in (25) we get

$S = .643\,X_1 + .881\,X_2 + 2.525\,X_3$ .


The raw measures when substituted in this formula gave the com- 
posite measures of efficiency. 
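The arithmetic of the illustration can be retraced in a few lines. The sketch below assumes NumPy; the value .333 for the correlation between measures 1 and 2 is an assumption inferred from the printed column sums 1.527 and 1.621, since the scanned table is not legible at that cell. The comparison with the exact characteristic vector is an added check, not part of the original illustration.

```python
import numpy as np

# Intercorrelations of the three efficiency measures.
# r12 = .333 is inferred from the printed column sums (assumption).
corr = np.array([[1.000, 0.333, 0.194],
                 [0.333, 1.000, 0.288],
                 [0.194, 0.288, 1.000]])
sigma = np.array([2.376, 1.839, 0.587])

col_sums = corr.sum(axis=0)           # approximate weights, equation (21)
weights = col_sums / sigma            # raw-score weights, equation (25)
print(np.round(col_sums, 3))          # [1.527 1.621 1.482]
print(np.round(weights, 3))           # [0.643 0.881 2.525]

# Agreement between the approximate and the exact standard-score weights,
# measured by the cosine of the angle between the two weight vectors.
exact = np.linalg.eigh(corr)[1][:, -1]
cosine = abs(col_sums @ exact) / (np.linalg.norm(col_sums) * np.linalg.norm(exact))
print(cosine)
```

Summing the columns is one multiplication of the correlation matrix by a vector of ones, which is why the result lies close to the exact weights when all intercorrelations are positive.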


The Procter and Gamble Company, 
Cincinnati, Ohio. 
THE BOUNDING HYPERPLANES OF A CONFIGURATION
OF TRAITS


L. L. THURSTONE 


The purpose of this paper is to describe one of the methods of 
analyzing a configuration of psychological tests or traits in the prob- 
lem of isolating primary factors. In factorial analysis a set of traits 
or tests is represented by a set of radial vectors which are so related 
to each other that the scalar product of each pair is equal to the cor- 
responding experimentally observed correlation coefficient. Each 
radial vector represents a trait. The dimensionality of the trait con- 
figuration is the rank of the reduced correlational matrix. The re- 
duced correlational matrix has communalities in the diagonal cells.* 

If there are n trait vectors in the configuration and if the dimen- 
sionality is r, then each of the n traits can be described as a linear 
function, a weighted sum, of any set of r linearly independent vectors. 
It is assumed here that n > r. These r traits in terms of which one
may describe or comprehend each of the traits in the configuration 
have been denoted reference traits. 

As far as the mathematical problem is concerned, the reference 
traits may be arbitrarily chosen so long as they are linearly indepen- 
dent. The scientific problem demands more than an arbitrary set of 
reference vectors. In the scientific problem of isolating fundamentally 
significant traits, it is, of course, essential that the basic traits be 
uniquely determined. 


Consider a rectangular n × r table V of weights. The r columns
represent the r unit reference vectors, and the n rows represent the n
traits in the configuration. Let the r entries in any row j represent
fractions of the r unit reference vectors. The vectorial sum of these
fractions produces the trait vector j. The simplest linear
comprehension of the n traits in the configuration is attained when the reference
vectors are so chosen as to maximize the number of vanishing entries 
in the table V. Such reference vectors have been denoted primary 
vectors and the traits which they represent in relation to the config- 
uration have been denoted primary traits. If a set of primary traits 
with a conspicuous number of vanishing entries in V can be found in 
terms of which the whole trait configuration can be comprehended, 


*L. L. Thurstone, The Vectors of Mind (Chicago: The University of Chicago 
Press, 1935), p. 66. 


then the primary traits should be of basic scientific significance. The
combination of the trait configuration and such a set of primary
reference traits has been called a simple structure. The primary unit
reference vectors may be regarded as the coordinate vectors for the
configuration and the entries in the matrix V are then the oblique
co-ordinates of the n traits in the configuration.

If the trait configuration consists of psychological tests, then the 
co-ordinate vectors represent primary abilities in terms of which the 
test abilities may be conceived. Since it is not likely that the primary 
abilities enter negatively into psychological tests, except rarely, it is 
to be expected that the coefficients in the corresponding matrix V will 
be all positive or zero, and that the only negative coefficients that may 
occur represent chance deviations from zero or small negative devia- 
tions from zero which are attributable to the ignoring of residual 
primary traits of low variance in the tests of the battery. 

The interpretation of positive and negative signs in a row of V 
may be illustrated graphically for the special case where r is 3. In 
Figure 1 let the intersections X, Y, Z, represent the termini of three 




FIGURE I 


unit reference vectors and let the surface of the diagram represent, 
diagrammatically, the surface of a unit sphere. Any test vector which
is describable as a weighted sum of the reference vectors, X, Y, Z, with 
all positive and non-vanishing weights pierces the surface of the unit 
sphere within the triangle XYZ. If only the first co-ordinate is
negative, then the test vector lies in the space (− + +). Similar
interpretation applies to the spaces (+ − +) and (+ + −). If a test vector
lies in the plane of X and Y, then its third coordinate vanishes so that 
the plane sector XY may be denoted (+ + 0). A test vector that is 
co-linear with the reference vector X has the coefficients (+ 0 0). The 
other spaces have similar interpretation. The space (— — —) is 
diametrically opposite to that of (+ + +). 

If the reference vectors are to represent primary abilities it is 
to be expected that there will be no significant negative entries in V. 
The test configuration should then be confined to the triangular space 
XYZ. Furthermore, if the test configuration can be represented in 
the form of a simple structure, then there will be at least one zero 
entry in each row of V and the entire test configuration is then con- 
tained in three bounding planes of the positive space XYZ. Each test 
vector will then belong to one of three types of spaces, namely, 
(0 + +), (+ 0 +), (+ + 0), depending on whether the first, the
second, or the third primary ability is absent in the test, unless it
requires only one of the primary abilities, in which case it belongs to one
of the types, (+ 0 0), (0 + 0), (0 0 +). In the latter case the test
vector is co-linear with one of the co-ordinate axes. 

For any test configuration of dimensionality r, it is a question of
fact whether a set of r reference vectors can be found such that the
entire test configuration is contained in the r bounding hyperplanes.
A principal object of a factorial analysis is to find these bounding 
hyperplanes whose radial intersections constitute the primary vectors. 
These, in turn, define the primary traits. Each of the bounding hyper- 
planes has the psychological significance that there is one primary 
ability that is absent from all the tests whose vectors are contained in 
the hyperplane. Each of the bounding hyperplanes is characterized 
and defined by the primary ability that is absent in it. A bounding 
hyperplane is one on whose normal each of the test vectors has a
projection that is either positive or near zero. In general it is essential
for a unique determination that each hyperplane contain a fairly large 
proportion of the test vectors which therefore have nearly vanishing 
projections on the normal to the hyperplane. 

If a subgroup of tests has been selected in all of which one
particular ability is absent, and if all of the remaining (r−1) abilities
are represented in the subgroup, then one of the bounding hyperplanes 
is defined by the fact that the sum of the squares of the projections of 
the test vectors in the subgroup is a minimum. If the subgroup cannot 
be listed on the basis of hypotheses the bounding hyperplanes may be 
found by successive approximation in which one maximizes the
number of nearly vanishing projections of the test vectors, subject to the
condition that appreciable negative projections must be excluded. 
Such hyperplanes with a large proportion of nearly vanishing pro- 
jections are extremely unlikely in an arbitrary configuration of vec- 
tors. By successive approximation the subgroup may be adjusted for 
each trial. For each trial all tests with negative projections as well 
as all test vectors with positive projections up to, say, +.20 or +.30 
are included in the trial subgroup. When the solution is approached, 
as indicated by a large proportion of test vectors in the range ±.10,
and by the absence of negative projections greater than −.15 or −.20,
the subgroup is taken to include all tests with projections below +.20
or below +.15.

One method of successive approximation is as follows. Let A be 
a trial vector which defines the hyperplane L. Then A is normal to 
L. See Figure 2. In the subgroup, let j be a test vector whose
projection on the trial vector A is the scalar product $(A \cdot j) = v_j$. The
scalar $v_j$ may be written* in the form

$$v_j = \sum_{m=1}^{r} a_{jm}\,\lambda_{mp} ,$$

and it may be represented in vector notation† as

$$v_j = (A \cdot j) .$$

FIGURE II


If the test vector j lies in the hyperplane L its projection on A is, of 
course, zero. If there is only one test vector, then A can be adjusted 
so as to become orthogonal to j by means of a single correction vector 


*Ibid., p. 155. 
†Vectors are here shown in bold face while scalars are shown in italics.
which may be taken colinear with j and of the opposite sign, or
colinear with $-(A \cdot j)\,j$.
Let the sum of such individual corrections be defined as

1) $\Delta = \sum_j v_j\, j = \sum_j (A \cdot j)\, j ,$

where Δ is a non-normalized vector in the plane of A and B. Let B be
a unit vector in the plane of A and Δ, and perpendicular to A so that
$B \cdot A = 0$. The projection of j on B is the scalar product $z_j = B \cdot j$. A
new trial vector C is to be taken in the plane of A, Δ, and B, on which
the sum of the squares of the projections of j is minimized. Let these
projections be

2) $w = C \cdot j .$

Since C is to be coplanar with A and B, we may write

3) $C = mA + pB ,$

where m and p are parameters to be determined.
Hence

4) $w = (mA + pB) \cdot j ,$

or

$w = m\,A \cdot j + p\,B \cdot j .$

Substituting v and z for the scalar products,

5) $w = mv + pz .$

The square of the projection w is

6) $w^2 = m^2v^2 + 2mpvz + p^2z^2 .$

For the subgroup, the sum of the squares of the projections of j on
C is

7) $u = \sum_j w^2 = m^2\sum_j v^2 + 2mp\sum_j vz + p^2\sum_j z^2 .$

The partial derivatives of u with respect to the parameters p
and m are

8) $\dfrac{\partial u}{\partial p} = 2m\sum_j vz + 2p\sum_j z^2 ,$

9) $\dfrac{\partial u}{\partial m} = 2m\sum_j v^2 + 2p\sum_j vz .$


The conditional vector equation is

10) $\zeta = C^2 - 1 = 0 ,$

which by (3) becomes

11) $\zeta = (mA + pB)^2 - 1 = 0 ,$

or

$\zeta = m^2A^2 + 2mp\,A \cdot B + p^2B^2 - 1 = 0 .$

Since A and B are unit vectors,

$A^2 = B^2 = 1 ;$

and since A and B are orthogonal vectors

$A \cdot B = 0 .$

Hence

12) $\zeta = m^2 + p^2 - 1 = 0 ,$

so that

13) $\dfrac{\partial\zeta}{\partial p} = 2p ,$

14) $\dfrac{\partial\zeta}{\partial m} = 2m .$


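The normal equations for m and p are

15) $\dfrac{\partial u}{\partial p} + \beta\,\dfrac{\partial\zeta}{\partial p} = 0 ,$

16) $\dfrac{\partial u}{\partial m} + \beta\,\dfrac{\partial\zeta}{\partial m} = 0 ,$

where β is a Lagrange multiplier. From 8, 9, 13, and 14, we have

17) $p\left[\sum_j z^2 + \beta\right] + m\sum_j vz = 0 ,$

18) $p\sum_j vz + m\left[\sum_j v^2 + \beta\right] = 0 .$

Eliminating β, the ratio of m and p may be found as follows:

19) $\sum_j z^2 + \dfrac{m}{p}\sum_j vz = -\beta ,$

$\dfrac{p}{m}\sum_j vz + \sum_j v^2 = -\beta ,$

so that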
20) $\dfrac{m}{p}\sum_j vz + \sum_j z^2 = \sum_j v^2 + \dfrac{p}{m}\sum_j vz .$

Let $\dfrac{m}{p} = s$. Then

21) $s\sum_j vz + \sum_j z^2 - \sum_j v^2 - \dfrac{1}{s}\sum_j vz = 0 ,$

or

22) $s^2\sum_j vz + s\left[\sum_j z^2 - \sum_j v^2\right] - \sum_j vz = 0 .$


Solving this quadratic with negative sign for the discriminant gives
a value for

$s = \dfrac{m}{p} ,$

so that

23) $m = ps .$

Normalizing m and p by substitution in (12) gives

24) $m = \dfrac{-s}{\sqrt{1 + s^2}}$

and

25) $p = \dfrac{-1}{\sqrt{1 + s^2}} .$


The value of m is positive since it is the coefficient of A in (3) and p
is negative since A is to be adjusted toward orthogonality with the
vectors j in the subgroup whose weighted sum is Δ. See Figure 2.
Before the new trial vector C can be determined, the vector B
must be found. It can be expressed as a linear function of A and Δ.
Hence

26) $B = xA + y\Delta ,$

where x and y are two parameters. By definition of B we have $B \cdot A
= 0$ and hence

27) $(xA + y\Delta) \cdot A = 0 ,$

or

28) $xA^2 + y\,\Delta \cdot A = 0 .$

Since A is a unit vector, we may suppress $A^2$, and then

29) $x = -y\,(A \cdot \Delta) .$
Since B is also a unit vector, we may write

30) $(xA + y\Delta)^2 = 1 ,$

or

31) $x^2 + 2xy\,A \cdot \Delta + y^2\Delta^2 = 1 .$

Substituting (29) in (31),

32) $y = \dfrac{1}{\sqrt{\Delta^2 - (A \cdot \Delta)^2}} ,$

and hence

33) $x = \dfrac{-(A \cdot \Delta)}{\sqrt{\Delta^2 - (A \cdot \Delta)^2}} ,$

in which y is always positive while x may have either a smaller
positive or negative value. By (32), (33), and (26), the vector B may
be determined since A is a given trial vector and Δ is also known from
(1).

The vector C in (3) is then taken as the new trial vector A and 
the successive adjustments are made until the angular separation be- 
tween A and the next trial vector C is small enough so that it may be 
ignored. The determination of each bounding hyperplane of the trait 
configuration can be made independently as here described for the 
general case in which it has not been imposed that the bounding hyper- 
planes be mutually orthogonal. A small number of approximations 
depends largely on successful initial judgment in selecting a tentative 
subgroup which is revised by each successive approximation until it 
becomes of rank (r−1).
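The successive approximation for a fixed subgroup can be sketched in code. The following is a minimal illustration assuming NumPy; the function name `refine_normal`, the stopping tolerance, and the five test vectors are hypothetical, the subgroup is held fixed rather than revised at each trial, and the sign conventions for m and p follow the statement in the text that m is positive and p negative.

```python
import numpy as np

def refine_normal(J, A, max_iter=100, tol=1e-12):
    """Adjust the unit trial vector A until the sum of squared projections
    of the subgroup test vectors (rows of J) on A is a minimum."""
    A = A / np.linalg.norm(A)
    for _ in range(max_iter):
        v = J @ A                          # projections v_j = A . j
        delta = J.T @ v                    # weighted sum of the vectors j, eq. 1)
        perp = delta - (A @ delta) * A     # part of delta perpendicular to A
        norm = np.linalg.norm(perp)
        if norm < tol:                     # A and the next C coincide
            break
        B = perp / norm                    # unit vector with B . A = 0, eq. 26)
        z = J @ B                          # projections z_j = B . j
        svz, svv, szz = v @ z, v @ v, z @ z
        # quadratic 22), taking the negative sign for the radical
        s = (-(szz - svv) - np.sqrt((szz - svv) ** 2 + 4.0 * svz ** 2)) / (2.0 * svz)
        p = -1.0 / np.sqrt(1.0 + s * s)    # p is negative
        m = s * p                          # m is positive because s is negative
        A = m * A + p * B                  # new trial vector C, eq. 3)
        A = A / np.linalg.norm(A)          # guard against rounding drift
    return A

# Hypothetical subgroup: five test vectors lying nearly in the plane z = 0.
J = np.array([[1.0, 0.0, 0.02],
              [0.0, 1.0, -0.01],
              [0.7, 0.7, 0.0],
              [0.6, -0.3, 0.015],
              [-0.2, 0.9, 0.01]])
A0 = np.array([0.3, 0.2, 0.93])
A_final = refine_normal(J, A0)
u_final = float(np.sum((J @ A_final) ** 2))
print(A_final, u_final)
```

Under these assumptions the limit of the process is the direction with the smallest sum of squared projections, which can be compared against the smallest characteristic root of `J.T @ J` as an independent check.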

The present method has other uses besides that of finding a mean 
principal axis in factor analysis. If there are n vectors in n dimen- 
sions the table of coordinates is a square non-singular matrix. If it 
is desired to find the inverse of a matrix, the problem can be thought 
of as that of finding the mean principal axes of n subgroups. The first 
column of the inverse represents the n direction numbers of the termi- 
nus of a vector which is orthogonal to the (n—1) vectors that are 
represented by all but the first row of the given matrix. Each
column of the inverse may be found in the same manner. 
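The geometric reading of the inverse can be confirmed on a small case. This sketch assumes NumPy and a hypothetical non-singular 3 × 3 matrix; it uses a library inverse rather than the successive approximation of the text, only to show the orthogonality property.

```python
import numpy as np

# Hypothetical non-singular matrix whose rows are three vectors in three dimensions.
M = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
M_inv = np.linalg.inv(M)
first_col = M_inv[:, 0]

# Scalar products of every row of M with the first column of the inverse:
# unity for the first row, zero for all the others.
products = M @ first_col
print(products)
```

The first column is therefore normal to the hyperplane containing all rows but the first, scaled so that its scalar product with the first row is unity.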

University of Chicago, 
Chicago, Illinois 
March 18, 1936 
NOTES ON THE RATIONALE OF ITEM ANALYSIS
M. W. RICHARDSON 


Item Validity 


There is increasing use of item analysis procedures for the 
improvement of objective examinations. The development of the pro- 
cedures of item analysis has consisted chiefly of the invention of vari- 
ous forms of an index of association between the test item and the 
total test score. At least ten indices of item validity have appeared 
in various articles, which have been chiefly concerned with the rela- 
tive effectiveness of the indices as devices for the improvement of 
tests. (4, 5, 6, 10). Since these indices of “item validity” are substi- 
tutes for or approximations to the ordinary coefficient of correlation 
between the item and the total test score, it may be useful to present 
certain deductions from simple correlational algebra. The present
writer is of the opinion that the ingenuity displayed in the invention
of new indices has outstripped the critical examination of the logical 
foundation for item analysis. The subsequent discussion is therefore 
concerned only with the underlying rationale of item analysis. 

The first step in the description of item analysis procedures is 
to express the item-test coefficient in terms of the item intercorrela- 
tions. A test score t is defined by the equation 


$$ t = x_1 + x_2 + x_3 + \cdots + x_n , \tag{1} $$


where t is the deviate score on the test, and the x’s are the deviate 
scores on the items, which are n in number. This definition embodies, 
of course, the usual practice of summing the unit or zero scores on 
the separate objective items to obtain the total test score. Let us 
take $r_{it}$, the correlation between any item i and the test t, as a
measure of item validity.


Then

$$ r_{it} = \frac{\sum x_i t}{N \sigma_i \sigma_t} , \tag{2} $$


where $\sigma_i$ is the standard deviation of item i, and $\sigma_t$ is the
standard deviation of the test scores. The general subscript i means that
the formula applies to any item of a given test. The summation is over
the population N. Substituting in (2) the value of t from (1), we
have


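$$
\begin{aligned}
r_{it} &= \frac{\sum x_i (x_1 + x_2 + \cdots + x_n)}{N \sigma_i \sigma_t} \\
       &= \frac{\sum x_i x_1 + \sum x_i x_2 + \cdots + \sum x_i x_n}{N \sigma_i \sigma_t} \\
       &= \frac{N r_{i1} \sigma_i \sigma_1 + N r_{i2} \sigma_i \sigma_2 + \cdots + N r_{in} \sigma_i \sigma_n}{N \sigma_i \sigma_t} .
\end{aligned} \tag{3}
$$
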
If we now assume that

$$ \sigma_1 = \sigma_2 = \sigma_3 = \cdots = \sigma_n , $$

which is rigidly true when all items are of the same difficulty as
measured by the percentage of correct response, and approximately true
for a wide range of difficulty, we have

$$ r_{it} = \frac{N \sigma_i^2 \sum_{k=1}^{n} r_{ik}}{N \sigma_i \sigma_t} = \frac{\sigma_i \sum_{k=1}^{n} r_{ik}}{\sigma_t} , \tag{4} $$


in which the summation is over the n correlations of item i with each
of the n items in turn. In order to further simplify equation (4), the
standard deviation of test scores $\sigma_t$ will be expressed in terms of
the test elements. Squaring (1), and summing, we have


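$$ \sum t^2 = \sum x_1^2 + \sum x_2^2 + \cdots + \sum x_n^2 + 2 \sum x_1 x_2 + 2 \sum x_1 x_3 + \cdots + 2 \sum x_{n-1} x_n = N \sigma_t^2 = N \sigma_i^2 \sum_{i=1}^{n} \sum_{k=1}^{n} r_{ik} . \tag{5} $$
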
The double summation indicates that all item intercorrelations 
are taken. We can simplify (5) to 


$$ \sigma_t = \sigma_i \sqrt{\sum_{i=1}^{n} \sum_{k=1}^{n} r_{ik}} . \tag{6} $$

Substituting this value of $\sigma_t$ in equation (4), we may write

$$ r_{it} = \frac{\sum_{k=1}^{n} r_{ik}}{\sqrt{\sum_{i=1}^{n} \sum_{k=1}^{n} r_{ik}}} . \tag{7} $$


Equation (7) expresses any item-test correlation as a function 
of the item intercorrelations. As applied to any item i of a test
homogeneous in difficulty, the item-test correlation is equal to the sum
of the correlations of that item with all items of the test, divided by 























the positive square root of the sum of all item intercorrelations. (In 
any actual test, the denominator of (7) will not be imaginary). Since 
the denominator is constant in any situation where item analysis pro- 
cedures are employed, it can be concluded that: 

In a test of uniform difficulty, the correlation of an item with 
the test is proportional to the average correlation of that item with 
each item of the test. 
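
Equation (7) can be illustrated on a small constructed case. The sketch below is not taken from the data of this paper: it uses four hypothetical items and eight hypothetical subjects, each item being passed by exactly half of the subjects so that all item standard deviations are equal. Under that assumption the item-test correlation computed directly should agree with the value obtained from the item intercorrelations. The helper names `mean` and `pearson` are illustrative.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson(x, y):
    """Ordinary product-moment correlation of two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: rows are items, columns are subjects (1 = pass, 0 = fail).
# Every item is passed by exactly 4 of the 8 subjects (uniform difficulty).
items = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 1, 0, 0],
    [1, 0, 1, 0, 1, 0, 1, 0],
]
n = len(items)

# Test score of each subject: the sum of the unit or zero item scores, as in (1).
totals = [sum(column) for column in zip(*items)]

# Full table of item intercorrelations, including the unit diagonal.
r = [[pearson(items[i], items[k]) for k in range(n)] for i in range(n)]
grand_sum = sum(sum(row) for row in r)

direct = [pearson(item, totals) for item in items]            # definition (2)
from_eq7 = [sum(r[i]) / sqrt(grand_sum) for i in range(n)]    # equation (7)
```

With equal item standard deviations the two lists agree to within rounding error; with unequal difficulties equation (7) holds only approximately.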

Since the item intercorrelation coefficients themselves form a dis- 
tribution, it may be concluded that: 

The rejection of items whose correlations with the test are rela- 
tively low raises the average intercorrelations of the remaining items. 

The formal similarity of equation (7) to Thurstone’s expression 
for the first factor loading for the Centroid Method is not accidental. 
(9). The first factor loading on the centroid is a measure of the cor- 
relation between a test and the sum or average of the tests in the 
battery. A similar interpretation may be made in the item analysis 
situation. The item-test coefficient measures the correlation between 
a variable (the item) and the sum or average of many such
variables. In this context, the item-test coefficient is the “factor” loading
of the item with an arbitrary test variable which is the sum of the
items. These considerations make it possible to conclude that:

The item-test coefficient gives an indication of the extent to which 
the item measures what the test as a whole measures. The item-test 
coefficient merely tells whether or not an item is in step with other 
items of the test. 


Item Validity and Test Reliability 


If we assume, as in the foregoing, equal difficulty of items, the 
Spearman-Brown Formula might be used to estimate the reliability 
of a test of n items from $\bar{r}$, the (average) correlation between two
items. This is significant in connection with the effect of rejection of
items with low item intercorrelations upon the reliability of the test.
Let us take $\bar{r}_{ik}$, the average item intercorrelation, as a measure of item
reliability. Equation (7) gives the expression for any item-test co- 
efficient. If we now add the n item-test coefficients we have 


$$ \sum_{i=1}^{n} r_{it} = \sqrt{\sum_{i=1}^{n} \sum_{k=1}^{n} r_{ik}} . \tag{8} $$

The sum of the item-test coefficients is simply the positive square
root of the sum of the item intercorrelations. 

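Writing equation (8) in terms of the respective average coefficients
we have

$$ n \bar{r}_{it} = + \sqrt{n^2 \bar{r}_{ik}} , $$

where $\bar{r}$ means the average of the respective r's.
This is simplified to

$$ \bar{r}_{it} = \sqrt{\bar{r}_{ik}} $$

or

$$ \bar{r}_{it}^{\,2} = \bar{r}_{ik} . \tag{9} $$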


Substituting $\bar{r}_{ik}$ in the Spearman-Brown Formula, we may write

$$ R = \frac{n \bar{r}_{ik}}{1 + (n-1) \bar{r}_{ik}} , \tag{10} $$


where R is the reliability coefficient. Solving for $\bar{r}_{ik}$, we have

$$ \bar{r}_{ik} = \frac{R}{n - nR + R} . \tag{11} $$

Also, from (9) and (11), we may write

$$ \bar{r}_{it} = \sqrt{\frac{R}{n - nR + R}} . \tag{12} $$


Equation (12) gives a direct solution for the mean item-test co- 
efficient. Either the mean item-test coefficient or its square may be 
used as a measure of the cohesiveness or purity of the test. If equa- 
tion (10) is used to compute the reliability coefficient from the mean 
item-test coefficient and the number of items, the estimate of the re- 
liability coefficient will not be subject to the fluctuations in the value 
of R which are due to the arbitrary samplings of items to get the two 
split-halves. These fluctuations may be considerable in magnitude for 
different split-halves when the test is short. (2). 
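
The use of equations (9), (10), and (12) may be sketched as follows. The function names and the numerical values (25 items, a mean item-test coefficient of .35) are assumptions chosen for illustration, not results reported in this paper.

```python
from math import sqrt


def reliability_from_mean_item_test(n_items, mean_r_it):
    """Equations (9) and (10): square the mean item-test coefficient to get the
    average item intercorrelation, then apply the Spearman-Brown Formula."""
    mean_r_ik = mean_r_it ** 2
    return n_items * mean_r_ik / (1 + (n_items - 1) * mean_r_ik)


def mean_item_test_from_reliability(n_items, reliability):
    """Equation (12): mean item-test coefficient implied by a reliability R."""
    return sqrt(reliability / (n_items - n_items * reliability + reliability))


# Hypothetical values: a 25-item test whose mean item-test coefficient is .35.
R = reliability_from_mean_item_test(25, 0.35)

# Equation (12) should recover the coefficient we started from.
recovered = mean_item_test_from_reliability(25, R)
```

Because the estimate depends only on the number of items and the mean coefficient, it does not vary with the particular division of the items into halves.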

From the foregoing equations, it is possible to conclude that: 

The rejection of items with low item-test correlations raises the 
reliability of a test, if the number of items is held constant. Whether 
the reliability coefficient will be raised absolutely, even with a re- 
duced number of items, depends upon the dispersion of the original 
item intercorrelations. If this dispersion is great, extending to a
number of negative values, it is theoretically possible to attain a higher
reliability with a smaller number of items. 
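
A small worked case shows how this can happen under equation (10). The figures are hypothetical: a 10-item test with an average item intercorrelation of .05, from which four items are rejected so that the six retained items have an average intercorrelation of .25. The function name `spearman_brown` is an illustrative assumption.

```python
def spearman_brown(n_items, mean_r_ik):
    """Equation (10): reliability of n items with average intercorrelation mean_r_ik."""
    return n_items * mean_r_ik / (1 + (n_items - 1) * mean_r_ik)


# Hypothetical original test: 10 items, average intercorrelation .05.
full_test = spearman_brown(10, 0.05)

# Hypothetical pruned test: 6 retained items, average intercorrelation .25.
pruned_test = spearman_brown(6, 0.25)
```

Here the shorter test has the higher estimated reliability, since the gain in average intercorrelation outweighs the loss of four items.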


True Variance and Item Intercorrelation 


An alternative way of expressing the relationship of item inter- 
correlation to reliability is here given for its illustrative value. The 
true variance can be expressed in terms of the number of items, their 
common standard deviation, and the average item intercorrelation. 
The test variance may be written: 


$$ \sigma_t^2 = n \sigma_i^2 + n(n-1) \bar{r}_{ik} \sigma_i^2 , $$

which may be simplified to

$$ \sigma_t^2 = n \sigma_i^2 [1 + (n-1) \bar{r}_{ik}] . \tag{13} $$


Equation (13) is simply another way of writing equation (5). 
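Since the true variance is given by

$$ \sigma_\infty^2 = R \sigma_t^2 , \tag{14} $$

we obtain by substituting in equation (14) the estimates of R and
$\sigma_t^2$ from (10) and (13) respectively,

$$ \sigma_\infty^2 = \frac{n \bar{r}_{ik}}{1 + (n-1) \bar{r}_{ik}} \cdot n \sigma_i^2 [1 + (n-1) \bar{r}_{ik}] . $$

This can be simplified to

$$ \sigma_\infty^2 = n^2 \sigma_i^2 \bar{r}_{ik} . \tag{15} $$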


The conclusion is that: 

For tests of homogeneous difficulty and constant length, the true 
variance is proportional to the average item intercorrelation. 
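
The simplification leading to equation (15) can be confirmed by direct substitution. The sketch below assumes hypothetical values for the number of items, the common item standard deviation, and the average item intercorrelation; the function names are illustrative.

```python
def total_variance(n_items, item_sd, mean_r_ik):
    """Equation (13): variance of the test scores."""
    return n_items * item_sd ** 2 * (1 + (n_items - 1) * mean_r_ik)


def reliability(n_items, mean_r_ik):
    """Equation (10): Spearman-Brown estimate of reliability."""
    return n_items * mean_r_ik / (1 + (n_items - 1) * mean_r_ik)


def true_variance(n_items, item_sd, mean_r_ik):
    """Equation (14): reliability multiplied by the test variance."""
    return reliability(n_items, mean_r_ik) * total_variance(n_items, item_sd, mean_r_ik)


# Hypothetical values: 25 items, common item standard deviation .48,
# average item intercorrelation .12.
n, sd, r_bar = 25, 0.48, 0.12
from_eq14 = true_variance(n, sd, r_bar)
from_eq15 = n ** 2 * sd ** 2 * r_bar   # equation (15)
```

Doubling the average intercorrelation while holding the length and item variance fixed doubles the true variance, as the conclusion states.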


Empirical Verification


It is hardly necessary to verify equation (7), since the verifica- 
tion must consist essentially of numerical substitution into each of 
two cognate algebraic formulas. Nevertheless, the following data are 
presented. Twenty-five objective items were selected from a long 
achievement test, in a completely random manner, except that they 
were of approximately the same difficulty. Table I gives the difficulty 
distribution of the items. 

The mean score of the 100 subjects on the 25 item test was 9.36; 
the standard deviation was 4.24. The item-test correlations were
computed by use of the formula for the point bi-serial coefficient (the
Pearson r)

$$ r = \frac{M_p - M_q}{\sigma} \sqrt{pq} , $$

where $M_p$ is the mean score of those passing the item, $M_q$ is the
mean score of those failing the item, $\sigma$ is the standard deviation
of the distribution of scores, and p and q are
the percentage of correct and incorrect answers, respectively. (7).
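
A minimal sketch of the point bi-serial computation follows, with a check that it agrees with the ordinary product-moment coefficient. The data are hypothetical, p and q are taken as proportions, and the standard deviation is computed with divisor N; the names `point_biserial` and `pearson` are illustrative.

```python
from math import sqrt


def point_biserial(item, scores):
    """(M_p - M_q) / sigma * sqrt(p q), with sigma computed over all N scores."""
    n = len(scores)
    passing = [s for x, s in zip(item, scores) if x == 1]
    failing = [s for x, s in zip(item, scores) if x == 0]
    p = len(passing) / n          # proportion passing the item
    q = 1 - p                     # proportion failing the item
    mean_all = sum(scores) / n
    sigma = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    m_p = sum(passing) / len(passing)
    m_q = sum(failing) / len(failing)
    return (m_p - m_q) / sigma * sqrt(p * q)


def pearson(x, y):
    """Ordinary product-moment correlation, for comparison."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one item (1 = pass, 0 = fail) and the test scores of 8 subjects.
item = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [9, 7, 4, 8, 5, 3, 6, 2]

r_pb = point_biserial(item, scores)
r_pm = pearson(item, scores)
```

The two values coincide because the point bi-serial formula is the product-moment coefficient specialized to a variable scored one or zero.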


TABLE I 


Percentage Number 
of correct of 
answers items 
35 
36 
37 
38 
39 
40 


The item intercorrelations were computed according to the for- 
mula 


$$ r_{12} = \frac{p_{12} - p_1 p_2}{\sqrt{p_1 q_1 p_2 q_2}} , $$

where $p_1$ = the percentage of population who give correct response
on the first item,

$p_2$ = the percentage of correct response on the second item,

$q_1 = 1 - p_1$ ,

$q_2 = 1 - p_2$ ,

$p_{12}$ = percentage of the population who give the correct response
to both items.
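
The intercorrelation formula may be sketched in code as follows. The two response lists are hypothetical, the percentages are expressed as proportions, and the function name `item_intercorrelation` is an illustrative assumption.

```python
from math import sqrt


def item_intercorrelation(item_1, item_2):
    """(p12 - p1 p2) / sqrt(p1 q1 p2 q2) for two items scored 1 (correct) or 0."""
    n = len(item_1)
    p1 = sum(item_1) / n                                   # proportion correct, first item
    p2 = sum(item_2) / n                                   # proportion correct, second item
    p12 = sum(a * b for a, b in zip(item_1, item_2)) / n   # proportion correct on both
    q1, q2 = 1 - p1, 1 - p2
    return (p12 - p1 * p2) / sqrt(p1 * q1 * p2 * q2)


# Hypothetical responses of 8 subjects to two items of equal difficulty.
first = [1, 1, 1, 1, 0, 0, 0, 0]
second = [1, 1, 1, 0, 1, 0, 0, 0]

r_12 = item_intercorrelation(first, second)
```

For these responses p1 = p2 = .5 and p12 = .375, so the coefficient is (.375 - .25) / .25 = .5.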


Table II displays in parallel columns the two independently com- 
puted values of the item-test coefficients of correlation. 


Summary


The foregoing development indicates that the reliability of a test 
may be improved by the use of the procedures of item analysis. Fur- 
thermore, such procedures will tend to make the test more pure or 
homogeneous, in the sense of conserving those items which have the 
largest intercorrelations. This is the only sense in which it may be 
said that the conserved items are more “valid” than the rejected 
items. (8, 10). 

The use of item analysis procedures of the type described does
not necessarily select items whose sums will give the best prediction
of an external criterion; Horst’s Method of Successive Residuals is a
solution of this problem. (3).


TABLE II

Item-test Correlation

Item    | Computed by the formula               | Computed by
Number  | $\frac{M_p - M_q}{\sigma} \sqrt{pq}$  | equation (7)
 1      | .424                                  | .424
 2      | .510                                  | .517
 3      | .289                                  | .287
 4      | .373                                  | .376
 5      | .157                                  | .181
 6      | .370                                  | .385
 7      | .285                                  | .285
 8      | .144                                  | .148
 9      | .262                                  | .254
10      | .189                                  | .202
11      | .456                                  | .4?4
12      | .080                                  | .079
13      | .564                                  | .561
14      | .328                                  | .326
15      | .436                                  | .438
16      | .515                                  | .514
17      | .214                                  | .218
18      | .388                                  | .387
19      | .416                                  | .421
20      | .280                                  | .284
21      | .412                                  | .410
22      | .481                                  | .477
23      | .312                                  | .309
24      | .274                                  | .272
25      | .559                                  | .556

Average discrepancy = 1.31 per cent of first computed value.


The University of Chicago, 
Chicago, Illinois. 

ON THE USE OF MATHEMATICS IN
PSYCHOLOGICAL THEORY* 


J. F. BROWN 


I. INTRODUCTION 


For the readers of a journal devoted to the application of mathe- 
matics to psychological research it is scarcely necessary to spend much 
time in answering the question, “Why mathematics?” Science may 
best be defined as that set of postulates regarding experience to which 
the universal assent of competent observers may be obtained, plus the 
organization of such postulates into theories for which universal as- 
sent is likewise obtainable. Of all the propositions about nature those 
concerned with mathematics are most readily given universal assent. 
From this state of affairs the Kantian aphorism that a discipline is as 
scientific as it contains mathematics is entirely consequent. Kant 
himself doubted the applicability of mathematics to psychology and 
so was led to question the possibility of a scientific psychology. Fech- 
ner, as is well-known, thought differently and from Fechner’s day the 
application of mathematical procedures has become an increasingly 
important part of psychological research until to-day we have a jour- 
nal devoted to such application alone. 

What reputation academic psychology† has with the educated
layman depends almost entirely on the researches of the line of dis- 
tinguished men, who following Fechner have attempted to apply the 
precision which accompanies mathematical thinking alone to psycho- 
logical problems. Thanks to them, we may determine as much about 
an individual’s intellect in an hour and that probably more accurately, 
than a teacher’s subjective estimate furnishes us in a year. We may 
decide which of our children are absolutely unsuited for a college or a 
musical education. From psychophysical research of a mathematical 
sort we may often proceed to the decision of important problems of 
neurophysiology. But despite these many advantages there are good 
reasons for believing that the application of mathematics has so far 
helped but little towards making psychology a systematized science. 
The promises of the early work of Fechner and Binet have not been 

*This paper contains in rather abstract form certain arguments of the
author’s monograph, “The Mathematical Conception Underlying the Theory of
Psychological and Social Fields.” The monograph has been privately printed in
a preliminary form. The whole monograph will be published in the near future.

†As opposed to psychopathology, psychoanalysis, etc.

fulfilled. The question of what, if anything, has been measured and
how these measurements are related to psychological theory is still 
an open one. Thus Thorndike, himself a leader in psychometrical re- 
search, writes: “Existing instruments represent enormous improve- 
ments over what was available twenty years ago, but three funda- 
mental defects remain. Just what they measure is not known, how 
far it is proper to add, subtract, multiply, divide and compute ratios 
with the measures obtained is not known; just what the measures 
obtained signify concerning the intellect is not known.”*

A systematized science like modern physics uses mathematics in 
making measurements as psychology has attempted to do, but an 
equally important application of mathematics in physics is to the
construction of theories. In any advanced science most of the
measurements performed depend on a close integration of theory, law, and
experiment. The older views of the scientific method which supposed
that measurements lead to laws through the discovery of correlations
between sets of measurements on different entities have been shown 
to be unsound. In actual scientific practice the theory leads to the 
law and the law to the possibility of measurement more often than 
measurement leads to laws and hence to theories (1). The psycholo- 
gist in his attempt at an empiricism, based on what he supposes to be 
a sound mechanistic methodology, has neglected the possibilities of 
applying mathematical procedures to the construction of psychological 
theory. Psychologists have made wide use of mathematics in measurement,
but have scarcely ever used mathematical concepts in theory-building.†
The purpose of this paper is to call to the attention of the
mathematical psychologist certain mathematical procedures which 
may be used in the construction of psychological theories. Lack of 
space prevents the mathematical development of these concepts. The 
various references, however, should enable the reader to pursue the 
mathematics of this mode of attack further should he so desire. 

Before doing this, two questions must be answered briefly. “Must 
we have theory in psychology?” and “If so, what must the nature of 
the theory be?” The first question is to be answered with a strong 
affirmative. All science is based on theoretical postulates of some 
sort. Those individuals like the Watsonian Behaviorists who have 
denied theory the most emphatically have been adherents to a very 
naive type of positivistic materialism. They have further implicitly 


*Thorndike (19). Although this judgment is eight years old, it could well
be repeated to-day. I have gone into my reasons for questioning the systematic
value of much psychometrical research in a separate paper. (1)

†There are some notable exceptions, of course, like Spearman and Thurstone.
It is quite a different approach which will concern us here.

accepted atomistic-mechanism and this theory has frequently dictated
to them what the “facts,” of which they have made so much, should 
be (4). The theory is there but it is not properly co-ordinated with 
the research. Hence, despite the methodological attractiveness of the 
strictly molecular behavioristic position, the comparative sterility of 
the results. Individuals like Hull (11) and Tolman (20) have clearly 
seen the necessity for theory even for the Behaviorists. 

Recent methodological research has also shown us the most 
fruitful type of theory. The theory should be based on what may be 
called the hypothetico-deductive method, or the method of constructs. 
In this method, hypotheses are devised to account for the descriptive 
data and from these hypotheses, predictions are made which may be 
tested in experiment. The constructs used in the hypotheses must be 
capable of operational definition. They must further lead to theoreti- 
cal postulates which may be tested in critical experiments. There is 
so much agreement now amongst methodologists on this point that to 
argue it further would require space which may better be spent on the 
development of the constructs themselves.* 

Arithmetic and algebraic concepts find their chief application to 
science in measurement. For the building of theories geometry is of 
greater importance. Consequently, the following sections of this pa- 
per will introduce certain geometrical conceptions applicable to 
psychological theory and give reference to their use in the investiga- 
tion of concrete problems by the hypothetico-deductive method. 


II. THE CONCEPT OF THE PSYCHOLOGICAL FIELD. 


In many ways the most important theoretical construct of modern 
physics is that of the field.† The idea of psychological fields has been
widely but somewhat loosely used by psychologists. It is easy to un- 
derstand that a construct which has been so fruitful for physical re- 
search should be adopted by psychologists at the time when psychology 
is changing from an Aristotelian to a Galileian science (13). It is to 
be regretted, however, that many psychologists in using the construct 
of the psychological field have failed to give it a precise mathematical 
definition. 

The psychological field is a space construct to which descriptions 
of psychological behavior may be ordered. Space is a manifold in 
which positional relationships may be expressed. In general the manifold

*Cf. the papers of Lewin (13) (14), Carnap (6) (7), Brown (2) (3) on
this point. The paper of Brown (3) gives considerable attention to the views of
other methodologists.

†Technical mathematical concepts will be italicized on introduction.

may be continuous or discrete, and position may be defined in
terms of distance and direction as in Euclidean space or only in terms 
of relation as in topological space.* Geometry, despite the actual
derivation of the word, is concerned with all possible logical constructs
about space. Since Riemann’s great paper (18), spaces may be con- 
structed of any dimensions and properties, provided these are logically 
consistent. Furthermore, Riemann showed that the properties of a 
space may be dependent on the dynamics of processes within that 
space. Consequently, metrical or topological fields may be in princi- 
ple of the same nature as the electromagnetic or gravitational field. 
This relationship between the properties of physical space and the 
dynamic processes within it is one of the most important of modern 
physics. Recent work in psychological theory indicates a similar re- 
lationship between the spatial properties of the psychological field 
and the psychodynamical processes (15). In our definition of the 
psychological field as a space construct, space must be understood in 
its post-Riemannian sense. The properties and dimensions of the 
psychological field will be more precisely defined after a consideration 
of its general nature. 

Every sample of human behavior may be analyzed physically, 
chemically, biologically, physiologically, psychologically, sociologically, 
perhaps also ethically. I refill my fountain pen. The physical analysis 
of such an event would describe the energy exchanges in terms of 
mechanics, (possibly in terms of the changes in atomic structure) 
which occurred as my hands executed the movements necessary for 
this act. Chemical analysis would be concerned with the chemical 
changes attendant upon it. The biologist would treat the activity as 
a problem in ecological adaptation. The physiologist would concern 
himself with the changes in the bio-chemistry of my body during the 
behavior. To the psychologist the behavior is analyzable as an exam- 
ple of goal-integrated activity. The sociologist would be concerned 
with the possible results of the act in the social group to which I be- 
long. The ethicist must decide as to whether I have done right in fill- 
ing my pen in order to write the lines which you are now reading. 
Any analysis of the behavior requires abstraction of certain of its 
aspects. To describe the physics of the act the physicist makes use of 
the construct of the gravitational field; psychologically the act may 
best be described as occurring in a psychological field. Statements 
like “the rat is hungry and trying to get the cheese,” “I am attempting 
a clarification of psychological theory,” are to be ordered to vectors 
within psychological fields. The psychological field is a construct to 


*Topology as a branch of geometry will be considered shortly. 

which all psychological activity (i.e., behavior) may be ordered. It
is spatial in the sense in which space has been defined above. 

The idea of psychological field may perhaps be clarified by com- 
paring it with mathematical and physical fields. Mathematical fields 
are spatial regions which may be either scalar or vector fields. A 
scalar field is a region where every point may have an associated set 
of magnitudes. A vector field is a region where every point is charac- 
terized by both direction and magnitude. Physical fields, as for in- 
stance force fields, have every point characterized by a vector, which 
represents the potential at that point. The points in the psychological 
field are associated with both direction and magnitude but these may 
for the present only be non-metrically defined. The behavior of an 
organism may be said to be directed towards a goal. The force behind 
the behavior may be said to have a magnitude. The magnitude may 
have an index-figure assigned to it.* Whenever an organism behaves 
psychologically, it may be said to be behaving in a psychological field. 
The goal which it is “trying” to find is to be ordered to a point with- 
in this psychological field. The force which is causing the behavior is 
to be ordered to a vector within this psychological field, as is its pres- 
ent position. 

For first analysis a two-dimensional plane suffices as an adequate 
construct for all psychological behavior problems. (A one-dimensional 
manifold would not be adequate, because we would then have no pos- 
sibility of ordering behavior which was not in the simple direction 
towards or away from the goal.) In the language of data, there is a 
rat (a man), which (or who), is trying to get cheese (a solution to
a mathematical problem). In the language of constructs, there is a 
vector in the psychological field, activating the rat (man) towards the 
goal (cheese or the solution of the problem). Both organism and goal 
are to be ordered to positions in the psychological field. The force 
(language of constructs), to which the behavior, (language of data) 
of both is to be ordered, represents a directed magnitude. The value 
of this vector depends on its position in the field. It is well known 
that when the goal is nearly attained the magnitude of the vector is 
greater. From this one can conclude that the co-ordinates to the 
points in the psychological field have magnitude. But the magnitude 
which must be assigned to position within the psychological field is 
non-metricized. Point-values in the psychological field are not yet 
metricized in character, while those in physical fields are metricized. 

*It is necessary to introduce the concepts of both vector and goal at this
point. The exact definition of these will be given later. Since psychological forces
are not measurable in fundamental terms, we speak of index-figures rather than
measurements. Cf. below.

The direction of vectors in the psychological field may be defined for
certain problems through the distinctive path between points within 
the field.* Consequently the chief methodological difference between 
the psychological field and the physical field is that direction and 
magnitude of the point values within the psychological field are not 
as yet to be given with the same precise definition. When bodies be- 
have physically or psychologically this behavior may be ordered to 
the construct of the physical or psychological field. The physical and 
psychological field both represent spatial constructs. Every
psychological activity may be ordered, for first approximation, to a
two-dimensional plane (a surface), where organism and goal represent
certain spatial regions within the surface. The surface must be treated
as a topological rather than a metricized field at the present time. 
It is mathematically possible to create as many additional dimensions 
to this continuum as are necessary to enable us to treat adequately the 
psychological descriptions of the language of data. 

By field-structure we shall mean the variations in the precision with
which the position of points in the psychological field may be given.
Following Lewin (15) we shall call fields unstructured where it is im- 
possible to give the position of (i.e., to distinguish) points. A field is 
said to be structured when one can distinguish large regions, but not 
infinitely small regions within it. When one can distinguish infinitely 
small regions or points within a field it is said to be infinitely struc- 
tured. The degree of structure refers to topological, i.e., non-metri- 
cized, fields. Only metrical fields are infinitely structured. So only in 
psychological problems where we are concerned with actual physical 
locomotions of the subject (the rat in the maze, for instance) is the 
psychological field infinitely structured. We can precisely define goal 
and initial position. For the chief problems of human psychology 
(the mathematician solving the problem, for instance) the field may
be said to be structured but not infinitely structured. For this reason, 
except for the simplest problems, like maze running, the space of the 
psychological field must be treated topologically rather than metri- 
cally. 


III. THE TOPOLOGICAL VARIANTS IN THE STRUCTURE 
OF THE PSYCHOLOGICAL FIELD. 


Topology (analysis situs) is defined by v. Kerékjártó (12) as
“that part of geometry, which investigates the properties of figures 
which remain unchanged under continuous transformation. These 


*Lewin has accomplished this in his recent paper on hodological space (15). 

are the relationships of connection and position, properties of a
qualitative nature.” The transformations admitted in topology are
arbitrary point-to-point transformations. Topology investigates the
non-metrical aspects of space, particularly the possible connections
between different spatial regions. It should be quite obvious that there
is relationship here to such modern psychological conceptions as 
gestalt, configuration, belongingness and membership-character. For 
psychological purposes one might define topology as the science which 
investigates the “belongingness” of spatial regions, and their connec- 
tivity with other regions. 

Like the theory of probability, topology grew up as a mathematical
step-child. Just as Galileo and Laplace amused themselves with
the formulation of probability postulates but considered them of little
real importance for science, so Leibniz and Euler played with the ideas
of topology. Riemann’s investigations on the connectivity of surfaces,
however, showed the importance of topology for the theory of functions,
and since that time topology has been granted a place as a reputable
mathematical science.

Poincaré in 1895 first attempted a mathematical foundation for 
general topology and since then a great many of the ablest geometri- 
cians have concerned themselves with its problems.* Today there is 
a great body of proven topological theorems and topology is applied 
in physical and psychological problems. Furthermore topology has 
been given a firm mathematical foundation in the theory of abstract 
sets. The introduction of the theory of abstract sets has been charac- 
terized by Fraenkel (8) as creating “a scientific revolution in mathe- 
matics, of not less importance than the Copernican system in astron- 
omy, than the Einsteinian relativity theory in physics.” 

Topology becomes a metricized geometry when direction and 
magnitude of topological concepts are defined. A circle, an ellipse and 
any polygon are topologically equal. So are a cube, a sphere and any 
closed three-dimensional figure. Topology investigates those spatial 
properties which are independent of metrics. For instance, any closed 
curve lying in a plane (the topological Jordan curve) has many such 
properties which have been handled mathematically by topologists. 
It can be proved that the Jordan curve divides a surface into two re- 
gions, of which the curve is the common boundary, that the Jordan 

*For the history and general references, cf. v. Kerékjártó (12). For the
American psychologist who wishes to orientate himself in this science it is
difficult to recommend general texts. Topology has several branches, of which the
most important for psychology is surface topology. The brief article of Franklin
(9) refers to the chief textbooks and introduces the simpler concepts. The theory
of point-sets is presented by Fraenkel (8), who gives adequate references to the
works of others. The forthcoming work of Lewin (14) includes a brief topological
introduction.

curve defines at the most one limited region, that it is impossible to
move from the inner limited region to the outer region without cross- 
ing the curve.* At the present time hundreds of such demonstrations 
are possible. 
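
The division of a plane into an inner and an outer region by a closed curve may be illustrated computationally. The sketch below is an assumption-laden illustration rather than a topological proof: it takes a hypothetical polygon given by metrical coordinates and applies the familiar ray-casting test, counting how often a ray drawn from a point crosses the boundary. An odd count places the point in the inner limited region, an even count in the outer region. The name `inside` and the square used are illustrative.

```python
def inside(point, polygon):
    """Ray casting: count crossings of a rightward horizontal ray with the polygon's edges."""
    x, y = point
    crossings = 0
    for i in range(len(polygon)):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % len(polygon)]
        # An edge can be crossed only if its endpoints lie on opposite sides of the ray's line.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                crossings += 1
    # An odd number of crossings means the point lies in the inner region.
    return crossings % 2 == 1


# Hypothetical closed curve: a square with corners at (0, 0) and (4, 4).
square = [(0, 0), (4, 0), (4, 4), (0, 4)]

inner_point = inside((2, 2), square)   # lies within the square
outer_point = inside((5, 2), square)   # lies beyond its right-hand side
```

Any path joining the first point to the second must change the crossing count from odd to even, and so must cross the curve at least once, which is the property asserted in the text.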

Topology is in many ways to be looked on as the basic science of 
space. With topology geometry becomes truly the science of positional 
relationships. Since relational thinking tends to structure itself in 
terms of spatial relationships, topology gives us the mathematics
necessary to set up theories about psychological problems, where
fundamental measurement is impossible at the present time.†

We are now ready to introduce the topological concepts applica- 
ble to psychological research. Any segment of space represents a 
region, and all spatial configurations (or figures) are regions. A point, 
a line, a plane, and a solid are regions of respectively 0, 1, 2, and 3 di- 
mensions. Points may be taken as topologically given, or they may be 
defined as the limiting case where n closed curves are so constructed 
that each succeeding curve lies within the boundaries of the one pre- 
ceding it. In the following we will speak of point-regions as those 
segments of space which will be treated mathematically as points. For 
the first approximation of many problems the individual may be 
ordered to a point-region in the psychological field. Similarly the goal 
may be ordered to a point-region, when the goal is clearly definable, 
i.e., where one can give its exact position relative to the subject. This
is by no means always the case for psychological activities. In cases 
of actual physical locomotion (all sorts of problems of mazes and 
circuitous routes), the spatial definition of the goal as point is rela- 
tively easy. When the goal is the solution of a mathematical problem 
or the attainment of a social status its definition is more difficult. A 
line-region connecting two points is called a path. Psychological ac- 
tivity of all sorts will be ordered to a path and may be said to repre- 
sent a locomotion in the psychological field. Thus a successful running 
of a maze represents a locomotion along the only path connecting the 
starting-position with the goal position. One of the chief problems of 
surface topology is the connectivity of certain points through certain 
paths and the problem of defining through what regions the path 
must run in order that the locomotion be attained. Spatial regions 
are said to be incident when it is possible to construct a path from a 
point in one to a point in the other without crossing any other region. 

*Any readers, who are so mathematically naive as to consider such proposi- 
tions unnecessary of proof, are reminded of the history of the parallel axiom, 
which seemed equally self-evident. 

†In fundamental measurement the arithmetic theorem of addition holds for
the numbers involved so scales may be established with equidistant units and a
zero point. Cf. Campbell (5).

The problem of incidence of regions is of psychological importance
for defining the possible locomotions between the individual and the 
goal. Regions are further characterized as bounded and unbounded, 
limited and unlimited, and one-fold, two-fold, to n-fold connected. The 
properties of such regions will be given in the following lines. These 
characterizations do not pretend to be precise topological definitions 
as it is necessary to introduce the concept of direction which has no 
topological meaning. It is believed, however, that for purposes of in- 
troduction such an approach will better clarify the concepts as used 
in actual psychological research. Whether or not a region is limited 
or bounded and its connectivity may also be determined by pure topo- 
logical methods. 

A region in which a point-region continues locomotion in a given 
direction indefinitely without return to its initial position is unlimited. 
If the point-region returns eventually to its initial position, the region
is said to be limited. A region in which locomotion of a point-region in a
given direction must eventually bring it incident to another region (the
boundary) is bounded. If the point-region does not eventually become
incident to another region, the region is said to be unbounded. A region in
which any point may be connected to any other point by at least one path, so
that the path becomes incident to no other region, is connected. A simply or
one-fold connected region is one which may be divided into two separate
connected regions by any cut through the region. Such a cut divides the
region into two regions so that every point within the original region
belongs to either one or the other of the new regions but not to both. A
two-fold connected region requires under conditions two such cuts to create
two simply connected regions; a three-fold connected region three such cuts,
and an n-fold connected region n such cuts. These concepts may perhaps best
be illustrated by reference to Figure I.
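
The cut criterion just stated can be tried out on a coarse discrete model. The following is only an illustrative sketch, not part of the original treatment: regions are assumed to be sets of grid cells, a "cut" is assumed to be a set of cells removed from the region, and the names `components`, `after_cuts`, `disc`, `ring`, `cut_E`, `cut_F`, and `cut_G` are hypothetical labels chosen to echo the one-fold connected region B and the two-fold connected region D of Figure I.

```python
from collections import deque


def components(cells):
    """Count the 4-connected pieces in a set of (row, col) grid cells."""
    remaining = set(cells)
    count = 0
    while remaining:
        count += 1
        queue = deque([remaining.pop()])
        while queue:
            r, c = queue.popleft()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    queue.append(nb)
    return count


def after_cuts(region, cuts):
    """Remove the cells of every cut and count the pieces that are left."""
    removed = set().union(*cuts) if cuts else set()
    return components(set(region) - removed)


# Assumed stand-in for a one-fold connected region: a filled 3x3 square.
disc = {(r, c) for r in range(3) for c in range(3)}
# Assumed stand-in for a two-fold connected region: the square minus its centre.
ring = disc - {(1, 1)}

cut_E = {(0, 1), (1, 1), (2, 1)}  # one cut straight across the square
cut_F = {(0, 1)}                  # first cut across the ring
cut_G = {(2, 1)}                  # second cut across the ring

print(after_cuts(disc, [cut_E]))         # 2: one cut separates the square
print(after_cuts(ring, [cut_F]))         # 1: the ring is still in one piece
print(after_cuts(ring, [cut_F, cut_G]))  # 2: a second cut is required
```

On this toy grid a single cut separates the square, while the ring stays in one piece until a second cut is made, which is the behavior the definitions of one-fold and two-fold connectivity describe.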

The curve A alone represents an unbounded region i.e., the two- 
dimensional topological plane. (The curve is broken to indicate the 
lack of boundary. It is of course necessary to draw it so, because the 
page itself is bounded.) The contours B, D, H, L, J, all define limited, 
bounded regions, and as contours are topologically all equal. C and 
C₁ represent point-regions, and the lines connecting them are paths
between them. All of these paths are topologically equal. B is one-fold
connected, as one cut E may be constructed through the region, dividing it
into two simply connected regions B₁ and B₂. D is two-fold connected. D
remains a simply connected region after the construction of the cut F
through it. In order to create two simply connected regions, the cut G must
also be constructed. All the figures
lie in the unbounded, unlimited region A. They may be said to be 
constructed in it. After such constructions have been made the space
may be said to be structured. Space is infinitely structured when one
can distinguish infinitely small regions within it. Hence K may be
said to represent a region that is infinitely structured, in that
Cartesian coordinates may be constructed within it and the x, y values
of any point within it may be given. L on the other hand is structured,
but we may only say that the point lies within both boundaries,
i.e., only its topological position may be given. In unstructured space
we can say nothing about the position of a point. At the present time
in psychology the space to which we assign most of our data may be
said to be structured but not infinitely structured. (Cf. Lewin (14),
(15).)

FIGURE I

The reader is perhaps by this time anxious to see how certain 
psychological data may be ordered to the topological concepts. The 
individual is ordered to a point-region in the psychological field. Sup- 
pose an individual is on the playing field of one of our modern Ameri- 
can athletic stadia, such as the Yale Bowl. If all the exits are blocked 
the individual’s actual physical locomotions occur in a limited, bounded
region. The region is, furthermore, one-fold connected because a barrier
across the field would divide the field into two simply connected regions.
(Cf. B in Figure I.) When the individual, either on instruction of the
psychologist or on his own initiative, moves, his direction of movement is
indicated by the vector v, which is a force directed towards the goal g.* In
this case, if one of the exit doors of the stadium is left open the field
becomes unbounded and the individual may leave it.

The example just given is one of actual physical locomotion and 
the properties of psychological space may be directly ordered to physi- 
cal space, so that the physical correlates of the subject’s locomotion 
may be given. Although many of the specific problems of animal 
psychology are to be treated in terms of such a space, the chief prob- 
lems of human and social psychology require a more developed spatial 
concept. All behavior is to be ordered to locomotion in the psychologi- 
cal field. The goal may be, as we pointed out above, of the nature of 
attainment of a certain social status or the solution of a mathematical 
problem. An individual, A, is a freshman student desirous of becoming
a member of a certain fraternity. In this case he is to be ordered
to a position in space outside the region to which members of this 
fraternity belong. The situation topologically is given in Figure II. 


FIGURE II


*The vector concept is a non-metricized dynamical concept and will be
precisely treated shortly.

A represents our student, B and C together the members of the fraternity.
A wishes to get into the region B and C. In order for him to become a
full-fledged member, C, he must go through the pledge region B.
Psychologically, in terms of the language of data, A wants to become a
member of the fraternity. We order this situation to a field in which B and
C represent bounded limited regions, where C lies within B. It is impossible
to make the locomotion from A’s original position to C without first
attaining membership-character in the region B. In this connection the
boundaries surrounding both B and C must be crossed.*

A similar situation arises in the case of problem solution. If the 
individual is trying to solve the Pythagorean theorem, he must, in the 
language of data, make certain Euclidean constructions in order to 
arrive at the proof. Topologically the situation is that given in Figure
III.† It is necessary to go through regions B and C to get to the goal D.
Not all individuals may fulfill this locomotion, and the “ease” with which
the barriers are crossed distinguishes a good geometry student from a poor
one. An individual is to be located in region B, if he has gone so far in
the solution of this problem that the first constructions are made.

FIGURE III
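
The two situations just described (the fraternity of Figure II and the problem solution of Figure III) may be sketched as a small incidence structure in which a locomotion is a sequence of incident regions. This is only an illustrative encoding under stated assumptions: the dictionaries `figure_ii` and `figure_iii`, the label `"outside"` for the student's initial region, the starting region `"A"` in Figure III, and the function name `locomotion` are all hypothetical and do not appear in the original.

```python
from collections import deque


def locomotion(incidence, start, goal):
    """Return the shortest sequence of incident regions from start to goal,
    or None when no locomotion is possible (breadth-first search)."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        region = queue.popleft()
        if region == goal:
            path = []
            while region is not None:
                path.append(region)
                region = previous[region]
            return path[::-1]
        for neighbour in incidence.get(region, ()):
            if neighbour not in previous:
                previous[neighbour] = region
                queue.append(neighbour)
    return None


# Assumed encoding of Figure II: C (full members) lies within B (pledges),
# so the student's outside region is incident to B only.
figure_ii = {"outside": ["B"], "B": ["outside", "C"], "C": ["B"]}

# Assumed encoding of Figure III: regions passed in turn on the way to goal D.
figure_iii = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

path = locomotion(figure_ii, "outside", "C")
print(path)           # ['outside', 'B', 'C']
print(len(path) - 1)  # 2 boundaries crossed

print(locomotion(figure_iii, "A", "D"))  # ['A', 'B', 'C', 'D']

# With the pledge region B removed, no locomotion to C remains.
without_b = {r: [n for n in ns if n != "B"]
             for r, ns in figure_ii.items() if r != "B"}
print(locomotion(without_b, "outside", "C"))  # None
```

The last call illustrates the statement that C cannot be reached without first attaining membership-character in B: once B is taken out of the incidence structure, the search finds no path.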

*The exact definition of boundary and membership-character will be given
†For convenience in drawing we will from now on discard the indication that
all our constructions are in the two-dimensional topological plane and simply
indicate the field of activity as a bounded region.

Regions in the psychological field are marked off by boundaries.
Boundaries have been topologically defined above. The psychological
significance of boundary is that in crossing a boundary the individual’s
reactions are changed. Our freshman student behaves differently after he
has become a member of the fraternity. Our geometry student’s consciousness
about the Pythagorean theorem as a problem is differently structured after
he sees the first steps in its solution. Sociologically all the members of
any organized group are to be ordered to a bounded region. Belonging to the
group gives the individual certain psychological characteristics which
differentiate him from non-members. Individual point-regions within a
bounded region are said to have membership-character within that region. We
shall see shortly that the dynamics of the field determine the variation in
membership-character amongst the individuals. All Catholics have
membership-character in the bounded region to which the members of the
Catholic Church are ordered, all Unitarians in the bounded region of 
the Unitarian Church. There is more variation in the Unitarian mem- 
bership-character than in the Catholic. In other words, a greater lati- 
tude of opinion on matters of religious dogma is allowed in the Uni- 
tarian Church. All the individuals within a bounded social region are 
affected in their behavior through the fact that they have
membership-character within this region. The boundary may be said to be
quasi-physical, quasi-social, or quasi-conceptual.* Quasi-physical are
boundaries like prison walls and club-buildings, where membership-character
is marked off by an actual physical boundary. The quasi-social
boundaries are those where social institutions and mores mark off the 
regions. The quasi-conceptual are those where the intellectual fac- 
tors function as boundaries. 

Psychologically a boundary represents a barrier to locomotion. 
This barrier is not necessarily impenetrable, but in crossing it, the 
point-region (individual) becomes ordered to a new social region 
and his “psychology” is changed. It is convenient to distinguish be- 
tween two types of psychological barrier, both of which represent 
topologically bounded social regions. In the following, group-barriers 
will be used to designate the limiting regions of social groups, inner- 
barriers to indicate blockages to locomotions within a given social 
region. Barriers may be quasi-physical, quasi-social, or quasi-conceptual.
The mores and social institutions, as well as actual walls and fences, are
to be ordered to barriers in psychological space.

FIGURE IV

Figure IV gives the barrier characterization of the proletariat 
and bourgeoisie as social groups. P represents the proletariat group 
barrier, B that of the bourgeoisie. a, b, c, d, e all represent inner
barriers within these group barriers. The detailed characterization
of such barriers may only be given in connection with the non-metricized
dynamical concepts now to be introduced.

*The prefix “quasi” indicates simply derived from physics, from sociology,
from logic in these terms. We are interested only in the psychological effects of
these physical, sociological and logical entities.

The two-dimensional manifold allows us to treat of all initial 
positions and goals and consequently to give the topology for any 
psychological activity considered by itself. It has long been realized 
by psychologists, however, that there are decided differences in such 
activities as perceiving, thinking, dreaming and day-dreaming. The 
differences between such activities necessitate treating problems of 
individual psychological acts in a three-dimensional manifold. The 
introduced third dimension has been called the reality dimension of 
the personality. It is necessary to introduce this third dimension be- 
cause of the structural differences in activities which may occur prac- 
tically simultaneously. For normal perception is said to have a higher 
degree of reality than thinking, and thinking a higher degree of re- 
ality than daydreaming. Thinking or even daydreaming may under 
circumstances, however, have a higher degree of reality than perceiv- 
ing (16). The same goal may be perceived, thought, or dreamt about. 
Consequently the degree of reality is the third dimension of the
psychological field. It is topologically treated as in Figure V. (The
reality dimension is continuous. In the diagram only the field structure
for two planes in the continuous dimension is shown. The barriers
in the plane of lesser reality are indicated with broken lines to show
their greater dynamical permeability, of which we will speak in the
next Section.)

FIGURE V


(To be concluded in next issue) 




















